Prior Application:

Art Unit: 1632

Examiner: S. Priebe

Assistant Commissioner for Patents
Washington, DC 20231 

Sir: 

This is a Request for filing a Continuation-in-Part application under 37 CFR 1.53(b) of pending prior
application Serial No. 08/776,274, filed on January 24, 1997, entitled DNA ENCODING OVINE ADENOVIRUS
(OAV287) AND ITS USE AS A VIRAL VECTOR, by the following named inventor(s): SUDHANSHU VRATI,
GERALD WAYNE BOTH, DAVID BERNARD BOYLE.

1. ^ I hereby state that the enclosed copy of this prior application is a true copy of the above-identified prior
application.

2. Oath or Declaration

a. □ Newly executed (original or copy)

b. □ Copy from a prior application (37 CFR 1.63(d))
i. □ Deletion of inventor(s)

Signed statement attached deleting inventor(s) named in the prior application, see 37
CFR 1.63(d)(2) and 1.33(b).

3. EH Incorporation By Reference (useable if Box 2b is checked) 

The entire disclosure of the prior application, from which a copy of the oath or declaration is supplied 
under Box 2b, is considered as being part of the disclosure of the accompanying application and is hereby 
incorporated by reference therein. 

4. ☒ Preliminary Amendment is enclosed.

5. □ An Information Disclosure Statement and PTO-1449 Form are submitted herewith.

6. □ Cancel claims .
The filing fee is calculated on the basis of the claims existing in the prior application as amended at 2 and 3
above:

                      NO. OF CLAIMS   EXTRA CLAIMS   RATE        AMOUNT
Total Claims          24 - 20         4              $18.00 =    $72.00
Independent Claims    6 - 3           3              $78.00 =    $234.00
Basic Application Fee                                            $760.00
If multiple dependent claims are presented, add $0.00            $0.00
Total Application Fee                                            $760.00
Subtract 1/2 if small entity                                     $0.00
TOTAL APPLICATION FEE DUE                                        $760.00

AMOUNT TO BE CHARG
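
The extra-claim arithmetic in the fee table can be sketched as follows. This is a minimal illustration, assuming the rates and allowances printed on the form ($18.00 per claim over 20, $78.00 per independent claim over 3, $760.00 basic fee); the function name `filing_fee` and its parameters are hypothetical. It only computes and sums the line items and does not attempt to reconcile them with the total lines as they read in this copy of the form.

```python
from decimal import Decimal


def filing_fee(total_claims, independent_claims,
               basic_fee=Decimal("760.00"),
               per_extra_total=Decimal("18.00"),
               per_extra_independent=Decimal("78.00"),
               free_total=20, free_independent=3):
    """Return (extra-total-claims fee, extra-independent-claims fee, sum of line items).

    Rates and allowances default to the values printed on the form (assumed).
    """
    extra_total = max(total_claims - free_total, 0)
    extra_independent = max(independent_claims - free_independent, 0)
    total_claims_fee = extra_total * per_extra_total
    independent_fee = extra_independent * per_extra_independent
    return total_claims_fee, independent_fee, basic_fee + total_claims_fee + independent_fee


# Claim counts taken from the table: 24 total claims, 6 independent claims.
total_claims_fee, independent_fee, line_item_sum = filing_fee(24, 6)
print(total_claims_fee, independent_fee, line_item_sum)
```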







7a. □ Enclosed is a Verified Statement to establish small entity status under 37 CFR 1.9 and 37 CFR 1.27.

7b. □ A verified Statement to establish small entity status under 37 CFR 1.9 and 37 CFR 1.27 was filed in
prior application and such status is still proper and desired.

8a. ☒ PLEASE CHARGE DEPOSIT ACCOUNT 500417 in the amount of $$26fc0fr

8b. ^ The Commissioner is hereby authorized to charge fees under 37 CFR 1.16 and 1.17 which may be
required, including any extension of time fees to maintain the pendency of the parent application Serial
No. 08/776,274 or credit any overpayment to Deposit Account No. 500417.

9. ☒ Amend the specification by inserting before the first line the sentence:

--This application is a Continuation-in-Part of Serial No. 08/776,274, filed January 24, 1997 as the
National Phase of PCT Application No. PCT/AU95/00453, filed July 26, 1995 and claiming priority to Australian
Application No. PM7101, filed July 26, 1994.--

10. Priority of Application Serial No. PCT/AU95/00453, filed July 26, 1995 and Australian Application
No. PM7101, filed July 26, 1994 are claimed under 35 USC 119. The certified priority document(s)
were filed in Serial No. 08/776,274 on July 26, 1994.

11. The prior application is assigned of record to

Commonwealth Scientific and Industrial Research Organisation

Parkville, Victoria, Australia

12. The power of attorney in the prior application is to:

McDermott, Will & Emery

13. Also enclosed:

32 pages of Drawings

14. □ A petition, fee and response has been filed to extend the term in the pending prior application until .

Address all future communications to: (May only be completed by applicant, or attorney or agent of record) 

McDermott, Will & Emery 
600 13th Street, N.W.
Washington, DC 20005-3096 

Respectfully submitted, 

MCDERMOTT, WILL & EMERY 




Robert L. Price 
Registration No. 22,685 



600 13th Street, N.W.
Washington, DC 20005-3096 
(202)756-8000 RLP:ajb 
Date: December 16, 1999 
Facsimile: (202)756-8087 






Docket No.: 50179-073 PATENT 
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re Application of 



GERALD WAYNE BOTH, et al. 
CIP of Serial No.: 08/776,274 
Filed: On even date herewith 



Group Art Unit: 1632 
Examiner: S. Priebe 



For: DNA ENCODING OVINE ADENOVIRUS (OAV287) AND ITS USE AS A 
VIRAL VECTOR 



PRELIMINARY AMENDMENT 

Assistant Commissioner for Patents 
Washington, DC 20231 

Sir: 

Prior to examination of the application, please amend the application as follows: 
IN THE SPECIFICATION:

Page 1, after the title insert --This application is a Continuation-in-Part of Serial
No. 08/776,274, filed January 24, 1997 as the National Phase of PCT Application No.
PCT/AU95/00453, filed July 26, 1995 and claiming priority to Australian Application
No. PM7101, filed July 26, 1994.--

Page 8, after line 27 and before "DESCRIPTION OF THE INVENTION" insert --
Figure 13 is a modified nucleic acid sequence of the OAV287 genome beginning at base
1 of the left hand ITR--.

Page 10, after line 15, insert the following: 

--In this specification, the term "substantially" means a sequence which will
hybridize to the specified sequence under conditions of high stringency.
When used herein, "high stringency" refers to conditions that:

(i) employ low ionic strength and high temperature for washing after
hybridization, for example, 0.1 x SSC and 0.1% (w/v) SDS at 50°C;

(ii) employ during hybridization conditions such that the hybridization
temperature is 25°C lower than the duplex melting temperature of the hybridizing
polynucleotides, for example 1.5 x SSPE, 10% (w/v) polyethylene glycol 6000
(Amasino, 1986), 7% (w/v) SDS (Church, 1984), 0.25 mg/ml fragmented herring sperm
DNA at 65°C; or

(iii) for example, 0.5M sodium phosphate, pH 7.2, 5mM EDTA, 7%
(w/v) SDS (Church, 1984) and 0.5% (w/v) BLOTTO (Johnson, 1984; Reed, 1985) at
70°C; or

(iv) employ during hybridization a denaturing agent such as formamide (Casey,
1977), for example, 50% (v/v) formamide with 5 x SSC, 50mM sodium phosphate (pH
6.5) and 5 x Denhardt's solution (Denhardt, 1966) at 42°C; or

(v) employ, for example, 50% (v/v) formamide, 5 x SSC, 50mM sodium phosphate
(pH 6.8), 0.1% (w/v) sodium pyrophosphate, 5 x Denhardt's solution (Denhardt, 1966),
sonicated salmon sperm DNA (50 µg/ml) and 10% dextran sulphate (Wahl, 1979) at 42°C.

See generally references Meinkoth, 1984; Reed, 1991; Dyson, 1991.

In a preferred embodiment, the polynucleotide sequences of the present invention 
share at least 60% identity, more preferably at least 80% identity, more preferably at least 
90% identity and more preferably at least 95% identity with a sequence set out 
in Figure 1 or Figure 13, wherein the identity is calculated by the BLAST program blastn 
as described in Altschul et al (1997). 
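
The identity thresholds recited above (60%, 80%, 90% and 95%) can be illustrated with a short sketch. It is not blastn: it assumes the two sequences have already been aligned to equal length with `-` marking gaps, and simply counts matching columns. The function names `percent_identity` and `preference_tier` and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pre-aligned pair ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both sequences.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no alignable columns")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def preference_tier(identity):
    """Return the highest recited threshold (95, 90, 80 or 60) that is met, else None."""
    for threshold in (95, 90, 80, 60):
        if identity >= threshold:
            return threshold
    return None


# Hypothetical 10-base alignment with a single mismatch.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(identity, preference_tier(identity))
```

blastn reports identity over the local alignments it finds, so figures from this sketch are only comparable when the same alignment is used.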

REMARKS

This application is amended to add additional subject matter to the specification,
to delete the multiple dependency of claims 8, 9, 11, 15, 17, 18, 23 and 24 to avoid the
multiple dependent claim filing fee and to add new claim 24.

To the extent necessary, a petition for an extension of time under 37 C.F.R. 1.136 is
hereby made. Please charge any shortage in fees due in connection with the filing of this
paper, including extension of time fees, to Deposit Account 500417 and please credit any
excess fees to such deposit account.

Respectfully submitted,

MCDERMOTT, WILL & EMERY

Robert L. Price
Registration No. 22,685

600 13th Street, N.W.
Washington, DC 20005-3096
(202) 756-8000 RLP:ajb
Date: December 16, 1999
Facsimile: (202) 756-8087

IN THE CLAIMS:

Please amend the claims as follows:

Claim 8, line 30, change "any one of claims 1 to 7" to --claim 1--.
Claim 9, line 32, change "any one of claims 1 to 30" to --claim 1--.
Claim 11, line 4, delete "or 10".
Claim 15, line 24, change "any one of claims 12 to 14" to --claim 12--.
Claim 17, line 2, change "any one of claims 12 to 16" to --claim 12--.

Please add new claim 24:

--24. An isolated DNA molecule comprising a nucleotide sequence of plasmid
pOAV100, the DNA molecule having the sequence set forth in Figure 13, or a
functionally equivalent nucleic acid sequence.--

DNA encoding ovine adenovirus (OAV287) and its use as a viral vector
Technical Field

The present invention relates to a new full length 
genomic clone derived from a benign adenovirus (OAV287) 
isolated from sheep in Australia. The present invention 
also relates to new viral vectors derived from the benign 
ovine adenovirus and also relates to the use of these 
vectors for the delivery and expression of nucleic acid 
sequences encoding functional RNA molecules or 
polypeptides to animals.
Background of the Invention

Diseases caused by infectious agents and parasite
infestations cause health problems and production losses
in domestic animals but for many infectious agents no
vaccine exists. Consequently, there are major research
efforts worldwide to develop new vaccines which can
protect against disease.

While some protective antigens from infectious
agents and parasites have been identified, their
successful use as vaccines requires the development of
systems which can effectively deliver the antigen to the
host. A variety of recombinant gene expression vectors
derived principally from the pox virus family have been
employed as these are generally of low pathogenicity.
Expression of the foreign protein following infection by
the recombinant viral vector may stimulate a protective
immune response in the host.

However, no viral vector has all the attributes
desirable for all situations. Some vectors are better
suited to particular tasks than others because of their
biological properties. For example, it has often proved
difficult to stimulate an effective mucosal immune
response which can protect against disease. In humans,
adenoviruses have been given orally to vaccinate against
respiratory disease (1). As this involves protection at
mucosal surfaces adenoviruses clearly have potential in
this regard. Human adenovirus vectors have also been used
to deliver genes to muscle (2) and other tissues.
Although adenoviruses do not generally integrate their DNA
into the cellular genome, nevertheless, the DNA persists
and long term protein expression is observed. Expression
of an appropriate antigen from such cells can generate a
systemic immune response which may be protective against
the homologous disease causing agent.

Known adenovirus genomes are linear double-stranded
DNA molecules which have an inverted terminal repeat
sequence (ITR) at each end and a protein covalently bound
to the 5'-terminal C residue (3). The genome sequence and
structure has now been completely determined for human
adenoviruses types 2, 5, 12 and 40 and partially for
numerous others, including some animal isolates (see
GenBank and EMBL Nucleic Acid databases). Human
adenovirus type 2 was the first genome to be sequenced but
broadly speaking its genome arrangement is conserved among
other characterized adenoviruses i.e. early regions E1-E4
and the structural protein homologues can be recognized in
similar locations in the genome. In particular, the
E1A/E1B region is located at the left hand end of the
genome and region E4 is always located at the right hand
end of the genome. Early region E3 is always located
between the genes for structural proteins pVIII and fiber,
although its size and complexity varies between species
e.g. from 3kb with at least 10 open reading frames in
human adenoviruses to approximately 0.7kb with only two
significant open reading frames in murine adenovirus (4,
5). E3 is a key region for the construction of
recombinant viruses as it is non-essential for replication
in vitro (6). The late, L region is expressed from the
major late promoter, MLP and complex splicing generates
families of mRNAs which code for most of the structural
viral proteins. Proteins IVa2 and IX appear to have their
own promoters.

Although there are some human viral vectors
available for medical use there are few animal viral
vectors suitable for use in veterinary applications. In
order to obtain a more suitable animal viral vector the
present inventors have purified an ovine adenovirus
(OAV287) isolated from sheep in Western Australia. This
ovine adenovirus is serologically related to bovine
adenovirus type 7 but is genetically distinct from the
bovine adenoviruses and other Australian ovine isolates,
as shown by comparisons between the ovine and bovine
adenoviruses, based on restriction enzyme profiles (8).
The genome arrangement of the virus according to the
present invention varies significantly from all other
known adenoviruses. The adenoviral DNA molecule of the
present invention is suitable for use in viral vectors
capable of expressing a variety of polypeptides when used
for veterinary applications.
Summary of the Invention 

According to a first aspect, the present invention
consists in an isolated DNA molecule comprising a nucleic
acid sequence encoding the genome of ovine adenovirus
(OAV287) substantially as shown in Figure 1 or a
functionally equivalent nucleic acid sequence.
Preferably, the nucleic acid sequence encoding the genome
of the adenovirus is substantially as shown in Figure 1.

In a further preferred embodiment of the first
aspect of the present invention, the DNA molecule
comprises a nucleic acid sequence encoding the genome of
ovine adenovirus (OAV287) wherein a portion of the
adenoviral genome not essential for the maintenance or
viability of the native adenovirus is deleted or altered.

In a second aspect, the present invention consists
in a DNA molecule including at least a fifteen nucleic
acid base sequence being substantially unique to the ovine
adenovirus (OAV287) nucleic acid sequence shown in Figure
1. In a preferred embodiment of the second aspect of the
present invention, the at least fifteen nucleic acid base
sequence encodes a functional element of ovine adenovirus
(OAV287). Preferably, the functional element is selected
from the group consisting of promoter, gene, inverted
terminal repeat, viral packaging signal and RNA processing
signal. The inverted terminal repeat of ovine adenovirus
(OAV287) comprises the first 46 nucleic acid bases from
the 5' end of each strand of the double stranded DNA
genome of the virus.
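
The statement that the inverted terminal repeat is the same stretch of bases read from the 5' end of each strand can be expressed as a simple check: the first n bases of one strand must equal the reverse complement of its last n bases. The sketch below is illustrative only; the helper names and the toy sequences are hypothetical and are not taken from Figure 1.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_inverted_terminal_repeat(genome, length=46):
    """True if the first `length` bases equal the reverse complement of the last `length`."""
    genome = genome.upper()
    if len(genome) < 2 * length:
        return False
    return genome[:length] == reverse_complement(genome[-length:])


def toy_genome(itr, core):
    """Build a toy linear genome carrying the same ITR at both ends (for illustration)."""
    return itr + core + reverse_complement(itr)


# Hypothetical 12-base repeat; the text recites 46 bases for OAV287.
genome = toy_genome("CATCATCAATAA", "GATTACAGATTACA")
print(has_inverted_terminal_repeat(genome, length=12))
```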

In a third aspect, the present invention consists in
a plasmid including the DNA molecule of the first or
second aspects of the present invention. Preferably, the
plasmid includes the DNA molecule of the first aspect of
the present invention wherein the nucleic acid sequence
encoding the adenoviral genome is linked to a nucleic acid
sequence encoding an origin of replication and a further
nucleic acid encoding a marker. Preferably, the nucleic
acid sequence encoding the marker encodes for resistance
to an antimicrobial agent. More preferably the
antimicrobial agent is ampicillin.

In a further preferred embodiment of the third 
aspect of the present invention, sequences encoding 
inverted terminal repeats of the adenovirus are joined. 

In a fourth aspect, the present invention consists
in a viral vector comprising the DNA molecule of the first
aspect of the present invention and at least one nucleic
acid sequence encoding a non-adenoviral polypeptide or
polypeptides.

Preferably, the nucleic acid sequence encoding the
non-adenoviral polypeptide or polypeptides is derived from
bacteria, viruses, parasites or eukaryotes. More
preferably, the non-adenoviral polypeptide is rotavirus
VP7sc antigen, the parasite polypeptide is
Trichostrongylus colubriformis 17kD antigen, the
Taenia ovis 45W antigen or the PM95 antigen from Lucilia
cuprina.

In another form, the present invention consists in a
viral vector comprising the DNA molecule of the first
aspect of the present invention and at least one nucleic
acid sequence encoding a functional RNA molecule. It will
be appreciated by one skilled in the art that a functional
RNA molecule can include a messenger RNA molecule, an
antisense RNA molecule or a ribozyme.

In a fifth aspect, the present invention consists in
a method of delivering a DNA molecule having a nucleic
acid sequence encoding a non-adenoviral polypeptide or
polypeptides to a target cell comprising infecting the
target cell with a viral vector according to the fourth
aspect of the present invention such that the DNA molecule
encoding the polypeptide or polypeptides is expressed and
the polypeptide or polypeptides is produced by the target
cell.

In a sixth aspect, the present invention consists in
a method for delivering a DNA molecule having a nucleic
acid sequence encoding a non-adenoviral polypeptide or
polypeptides to an animal comprising administering to the
animal a viral vector according to the fourth aspect of
the present invention such that the viral vector infects
at least one cell of the animal and the infected cell
expresses the DNA molecule encoding the polypeptide or
polypeptides and produces the polypeptide or polypeptides.
Preferably the animal is a grazing animal and more
preferably the grazing animal is a sheep.

In another form, the present invention consists in a
method for delivering a DNA molecule having a nucleic acid
sequence encoding a functional RNA molecule to an animal
comprising administering to the animal a viral vector of
the fourth aspect of the present invention having a
nucleic acid sequence encoding a functional RNA molecule
such that the viral vector infects at least one cell of
the animal and the infected cell expresses the DNA
molecule encoding the functional RNA molecule and produces
the functional RNA molecule.

As used herein the term "functionally equivalent
nucleic acid sequence" is intended to cover minor
variations in the ovine adenovirus (OAV287) DNA molecule
which, due to degeneracy in the DNA code, do not result
in the molecule encoding different viral polypeptides.
Further, this term is intended to cover alterations in the
DNA code which lead to changes in the encoded
polypeptides, but in which such changes do not
substantially affect the biological activities of these
viral polypeptides.
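
The first limb of this definition, variation that is silent because of degeneracy in the DNA code, can be illustrated with a short sketch. It uses only a handful of standard-code codons and hypothetical toy sequences; the names `CODON_TABLE`, `translate` and `encodes_same_polypeptide` are illustrative and not part of the specification. The second limb (changes that do not substantially affect biological activity) cannot be assessed by sequence comparison alone and is not modelled here.

```python
# Partial standard genetic code: only the codons used in this illustration.
CODON_TABLE = {
    "ATG": "M",
    "GCT": "A", "GCC": "A", "GCA": "A", "GCG": "A",
    "CTG": "L", "CTT": "L", "TTA": "L",
    "AAA": "K", "AAG": "K",
    "TAA": "*", "TGA": "*",
}


def translate(dna):
    """Translate an in-frame coding sequence using the partial table above."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encodes_same_polypeptide(seq_a, seq_b):
    """True if both coding sequences translate to the same amino acid sequence."""
    return translate(seq_a) == translate(seq_b)


original = "ATGGCTCTGAAATAA"   # M A L K stop
silent = "ATGGCCTTAAAGTGA"     # different codons, same polypeptide
changed = "ATGGCTAAAAAATAA"    # L replaced by K

print(encodes_same_polypeptide(original, silent))
print(encodes_same_polypeptide(original, changed))
```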

As used herein the term "functional element" is
intended to cover nucleic acid sequences that encode
promoters, genes, inverted terminal repeats, viral
packaging signals and RNA processing signals. It will be
appreciated by one skilled in the art that unique
sequences from ovine adenovirus (OAV287) that encode these
functional elements may be useful in other systems
including plasmids and non-ovine adenoviral vectors.

In order that the nature of the present invention
may be more clearly understood preferred forms thereof
will be described with reference to the following examples
and the accompanying drawings.
Brief Description of the Drawings

Figure 1 is the nucleic acid sequence of the OAV287 
genome beginning at base 1 of the left-hand ITR. 

Figure 2 shows the arrangement of OAV287 genes based
on homologies detected with Ad2. Regions with question
marks are tentative identifications because of the lack of
obvious homology.

Figure 3 indicates the major open reading frames in
the proposed E1 region of OAV287. Asterisks show the
location of possible initiation codons. A previously
unidentified gene (p28kD) which codes for a processed
structural protein is encoded on the complementary strand.

Figure 4 shows open reading frames in the region of
the OAV287 expected to contain E3. However, E3 is missing
as the gap between the pVIII and fiber genes is only 197
nucleotides. The site at which the ApaI/NotI polylinker
was later inserted is indicated.

Figure 5 shows the major open reading frames in the
probable E3 region of OAV287. Asterisks show the location
of potential initiation codons. The SalI site which was
modified by end-filling and re-ligation and the
alternative site at which a polylinker sequence was later
inserted into the genome without loss of infectivity are
indicated.

Figure 6 is a scheme describing the construction of
a plasmid (pOAV287Cm) containing a full-length clone of
the OAV287 genome with pACYC184 sequences inserted in the
SalI site. Filled in regions show OAV287 sequences.
Cross-hatched sequences are derived from plasmids pUC13 or
Bluescribe M13+ (AmpR), stippled regions from pSELECT
(TetR) and open regions from pACYC184 (CmR). Only the key
restriction sites used for plasmid construction are
indicated.

Figure 7 shows a map of the plasmids pOAV100,
pOAV200, pOAV600 and pOAVSOOS. Arrowheads indicate the
ITRs and the approximate location of the major late
promoter (MLP). The mutated SalI site and sites at which
the ApaI/NotI polylinker sequences were inserted are
indicated. Light hatching signifies modified Bluescribe
sequences inserted in the KpnI site. Linear, infectious
genomes (dark hatching) are released by digestion with
KpnI.

Figure 8 shows the results of screening ovine
adenoviruses OAV100 and OAV200 rescued by transfection of
recombinant plasmids pOAV100 and pOAV200 into CSL503
cells. Portions of the genome spanning (A) the mutated
SphI site in OAV100 and (B) the ApaI/EcoRV/NotI polylinker
insertion site in OAV200 were amplified by PCR together
with the corresponding regions from wild-type OAV287. The
products were digested with SphI (A, lanes 3 & 5) and
ApaI, EcoRV or NotI (B, lanes 3-5, and 8-10,
respectively). (U) indicates undigested samples.

Figure 9 is a map of a plasmid pMT used for the
assembly of gene expression cassettes. Fragments
containing the OAV287 major late promoter and tripartite
leader sequences are linked and precede a multiple cloning
site for the insertion of genes of interest. A tandem
polyadenylation signal (AATAAA) follows.

Figure 10 shows a summary of recombinant viruses
which have been rescued from the corresponding infectious
plasmids and the gene expression cassettes they carry.
Cassettes were inserted into the OAV genome between the
pVIII and fibre genes as indicated.

Figure 11 shows the expression of (A) the T. ovis
45W and L. cuprina PM95 antigens in CSL503 cells following
infection of these cells with OAV205 and OAV210 viruses,
respectively and (B) VP7sc expression in CSL503 and bovine
nasal turbinate cells following infection with virus
OAV204. (I) Infected cells. (U) Uninfected cells. (M)
indicates marker proteins of the sizes shown.

Figure 12 shows expression of VP7sc in (A) CSL503
cells and (B) rabbit kidney and bovine nasal turbinate
cells following infection with OAV206 virus. (I) Infected
cells. (U) Uninfected cells. (M) indicates marker proteins
of the sizes shown.
Description of the Invention 
METHODS 

Growth and Purification of OAV287

The virus, isolated from sheep in 1985, was obtained
from R.L. Peet, Animal Health Laboratory, Department of
Agriculture, Western Australia. The virus isolate was
grown in sheep foetal lung cells (line CSL503) and twice
plaque-purified under solid overlay before stocks were
prepared. Virus was purified from CSL503 cells as
described previously (13, 22). DNA was extracted from the
virus by digestion with proteinase K (23).
Cloning of Genome Fragments

Molecular techniques for manipulation, modification
and transformation of plasmid DNA which were used in the
work described below are described in (9) and similar
publications. OAV287 DNA was digested with various
restriction endonucleases including BamHI, SphI, SmaI and
SalI to deduce the location of these sites (18).

The adenovirus genome has a protein covalently
linked to each end of the linear dsDNA (24). The BamHI A
and D fragments of approximately 8kb and 4kb,
respectively, were identified as the terminal genomic
fragments because their migration into agarose gels was
dependent on the pre-digestion of viral DNA with
proteinase K. The internal BamHI fragments B, C, E and F,
estimated at 6.2, 5.1, 3.4 and 1.1kb in size respectively,
were separated on an agarose gel, recovered and cloned
into BamHI-digested pUC13 using standard ligation and
transformation procedures (9). To clone the terminal
BamHI A and D fragments, viral DNA (10µg) was digested
with proteinase K (50µg/ml in 10mM Tris/HCl, pH8.0,
containing 1mM EDTA and 0.5% SDS) at S5°C for 60min to
remove the terminal protein. The DNA was extracted twice
with phenol/chloroform, once with ether and recovered by
ethanol precipitation. The 3' ends (of unknown sequence)
were then digested exo-nucleolytically with T4 DNA
polymerase (5 units, Toyobo, Tokyo, Japan) in the presence
of dATP (100µM) in buffer containing Tris HCl (50mM),
pH8.0, MgCl2 (7mM), 2-mercaptoethanol (7mM) and BSA
(10µg/ml) for 15min at 37°C. The DNA was again purified
by phenol extraction and ethanol precipitation as described
above. To remove the single-stranded terminal regions and
create blunt ends the DNA was digested with 1 unit of mung
bean nuclease (Pharmacia, North Ryde, Australia) for 10
min at 37°C in buffer containing Na acetate (30mM), pH4.6,
NaCl (50mM) and ZnCl2 (1mM) before extraction with
phenol/chloroform and recovery by ethanol precipitation.
Finally the DNA was digested with BamHI (Pharmacia) and
the fragments were separated by electrophoresis in
low-melting-point agarose. The BamHI A and D fragments were
excised, recovered by NACS column chromatography (BRL,
Gaithersburg, Md) and ligated with BamHI/HincII-cut
plasmid Bluescribe M13+ (Stratagene, La Jolla, Ca) prior
to transformation into E. coli JM109. Positive clones
carrying fragments of the expected size were identified,
restriction digested and confirmed as correct by
nucleotide sequencing and comparison with partial sequence
determined directly from genomic DNA. This revealed that
three 3'-terminal nucleotides were removed during the
cloning procedure.

Nucleotide Sequencing of the OAV287 Genome

The complete sequence of the OAV287 genome was
determined by sequencing the BamHI fragments A-F using the
Sanger method (25) and various kits provided by commercial
suppliers. Nested deletions were constructed for the five
largest fragments using a double-stranded nested deletion
kit (Pharmacia). These were sequenced using standard
primers. Based on newly determined sequence other
nucleotide primers were synthesised using a DNA
synthesizer (ABI, Model 391). In this way both strands of
the entire genome and the junctions between the fragments
were sequenced.

Mutagenesis of the OAV287 Genome

For the construction of a full length OAV287 clone
and subsequent modification of it to create plasmids such
as pOAV200 and pOAVSOO certain mutations were required. A
relevant portion of the genome was subcloned into
Bluescribe (Stratagene, La Jolla, Ca) or a similar plasmid
which allowed rescue of single stranded DNA. Later it
became possible to use dsDNA for mutagenesis.
Oligonucleotides of the desired sequence were synthesized,
phosphorylated and used as primers as described by the
manufacturers of Muta-Gene Phagemid (Biorad Labs, Ca) or
Altered Sites II (Promega, Wi) mutagenesis kits.
Mutations were generally identified by digestion with the
appropriate restriction enzyme or by nucleotide
sequencing, or both. Genome fragments containing
introduced mutations were subcloned to create larger
plasmids such as pOAV200 using appropriate unique
restriction sites.
Construction of a Full-Length Genomic Clone of OAV287

The terminal BamHI A and D fragments (cloned in
Bluescribe M13+) were each modified by mutagenesis to add
the nucleotides lost during cloning and a KpnI site. The
last base of the KpnI site incorporated the C at the 5'
end of each genomic ITR sequence. This produced plasmids
pAK and pDK (Figure 6).

The left hand approximately 21.5kb of the genome was
constructed from the BamHI D and B fragments and the SphI
A fragment of approximately 13kb. The genomic BamHI B
fragment cloned in pUC13 was modified by mutagenesis
(GCATGC to GCATCC) to remove the SphI site at position
8287 producing pUC13B. The modified fragment was released
by BamHI digestion and cloned into pDK which had been cut
with BamHI and dephosphorylated. Colonies carrying the
recombinant plasmid pDBM (Figure 6) were identified by
screening with an oligonucleotide which spanned the BamHI
B/D junction. The SphI A fragment (approximately 13kb)
was cloned into the SphI site of pSELECT (Promega) to form
pSESPH. This fragment contains a SmaI site near its left
hand end which is common to pDBM. The KpnI/SmaI fragment
from pDBM was subcloned into pSESPH which had also been
cut with KpnI/SmaI to produce pSELLH, a plasmid based on
pSELECT which now contained the left-hand approximately
21.5kb of OAV287 DNA.
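
The effect of the GCATGC to GCATCC change on the SphI site can be illustrated with a short sketch. It assumes only that SphI recognises GCATGC and BamHI recognises GGATCC (both palindromic, so a single-strand search is sufficient); the toy sequence and the helper name `find_sites` are hypothetical and are not taken from the OAV287 genome.

```python
SPHI_SITE = "GCATGC"   # SphI recognition sequence
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence


def find_sites(seq, site=SPHI_SITE):
    """Return the 1-based start positions of every occurrence of `site` in `seq`."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(site, start + 1)  # allow overlapping matches
    return positions


# Hypothetical fragment carrying one SphI site followed by one BamHI site.
fragment = "TTAAGCATGCGGATCCAA"

# Single-base change described in the text: GCATGC -> GCATCC.
mutated = fragment.replace("GCATGC", "GCATCC", 1)

print(find_sites(fragment), find_sites(mutated), find_sites(mutated, BAMHI_SITE))
```

In this toy case the point mutation abolishes the SphI site while the neighbouring BamHI site is untouched, which is the property the construction relies on when the modified fragment is later released by BamHI digestion.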
The right-hand end of the genome was constructed
from pAK which contains the right-hand approximately 8.6kb
of the genome and overlaps the SphI A fragment. pAK was
cut with SalI and ligated with SalI-cut pACYC184, a
plasmid of 4.24kb which contains a gene encoding
chloramphenicol (Cm) resistance and an origin for DNA
replication, to form pACm (Figure 6). This plasmid was
cut with SphI and KpnI to produce the right-hand genomic
fragment incorporating the pACYC184 sequences. This was
ligated with the left-hand KpnI/SphI fragment of
approximately 21.5kb prepared from pSELLH to produce the
final plasmid pOAV287Cm (Figure 6). This plasmid
replicates stably in E. coli and therefore removes the
need to propagate the virus to obtain genomic DNA for
further study. The recombinant genome in plasmid
pOAV287Cm differs from the wild-type viral genome by the
single point mutation in the SphI site (base 8287), by the
presence of pACYC184 sequences in the SalI site and by the
addition of a GTAC sequence between the ITRs. However,
insertion of pACYC184 sequences in the SalI site disrupts
two significant open reading frames whose functions are
unknown. If either of the gene products was essential for
replication, then pOAV287Cm could not produce infectious
virus following transfection. To circumvent this
potential problem pOAV287Cm was modified further. First,
plasmid Bluescribe M13- (Stratagene, La Jolla, Ca.) was
cut with HindIII and end-filled. The linear plasmid was
then cut with SmaI, blunt-end ligated and transformed.
The resulting plasmid contained an ampicillin resistance
gene and origin of replication and lacked SalI and SphI
sites but retained a unique KpnI site. This plasmid was
cut with KpnI and ligated with KpnI-cut pOAV287Cm.
Plasmids which were doubly resistant to ampicillin and
chloramphenicol were selected and grown. One of these was
cut with SalI to release the pACYC184 sequences, religated
and transformed. The resulting plasmid pOAV100 contained
the AmpR gene and replication Ori inserted in the KpnI
site between the ITRs of the genome (Figure 7). This
plasmid replicated stably in E. coli strain JM109 when
maintained in the presence of ampicillin (200µg/ml). Large
quantities of plasmid were grown for transfection studies.
Transfection of DNA and Virus rescue
To determine whether the recombinant genomic clone
was infectious, pOAV100 was cut with KpnI to release the
linear viral genome and DNA was transfected into CSL503
sheep foetal lung cells using lipofectamine (GibcoBRL).
Solution (A) containing plasmid DNA (2-10µg) and 300µl
EMEM (containing hepes + glutamine), but lacking foetal
calf serum (FCS) and solution (B) containing lipofectamine
(10µl) + 300µl EMEM (containing hepes + glutamine), but
lacking FCS were combined, mixed gently and incubated for
45 minutes at room temperature. Subconfluent CSL503 cells
in a 60mm petri dish were rinsed with 3ml EMEM (plus hepes
and glutamine) lacking FCS. EMEM (plus hepes and
glutamine) but lacking FCS (2.4ml) was added to the
mixture of solutions A and B, mixed gently and added to
the rinsed CSL503 cells. Cells were incubated for 5 hours
at 37°C in 5% CO2. The incubation medium was changed
using complete EMEM plus FCS (10%) and cells were
incubated at 37°C in 5% CO2 until virus plaques or
cytopathic effect was visible (7-15 days).
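
As a rough check on the volumes in the protocol above, the following Python sketch adds up the transfection mixture and gives the resulting DNA concentration. It assumes that solutions A and B are each 300µl and that 2.4ml of serum-free EMEM is added before the mixture goes onto the cells. The function and parameter names are illustrative only.

```python
def transfection_mix(dna_ug, solution_a_ul=300.0, solution_b_ul=300.0,
                     added_medium_ul=2400.0):
    """Return total volume (ml) and DNA concentration (ug/ml) of the final mixture.

    The default volumes are assumptions read from the protocol text.
    """
    total_ml = (solution_a_ul + solution_b_ul + added_medium_ul) / 1000.0
    return {"total_ml": total_ml, "dna_ug_per_ml": dna_ug / total_ml}


if __name__ == "__main__":
    # The protocol gives a range of plasmid DNA amounts; both ends are shown.
    for dna_ug in (2.0, 10.0):
        mix = transfection_mix(dna_ug)
        print(f"{dna_ug:g} ug DNA: {mix['total_ml']:.1f} ml total, "
              f"{mix['dna_ug_per_ml']:.2f} ug/ml")
```

Under these assumptions the cells receive 3ml of mixture, the same volume used for the rinse.
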

To confirm that viruses rescued from transfection of
pOAV100 and pOAV200 were derived from those plasmids a
portion of the genome of wild-type OAV287, OAV100 and
OAV200 viruses was amplified by PCR. For OAV100 a primer
pair spanning the region of the mutated SphI site at bases
8287-8292 was used. For OAV200 the primer pair spanned
the insertion site for the ApaI/NotI polylinker between
the pVIII and fiber genes. Wild-type OAV287 DNA was
amplified as a control in each case. DNA amplified from
wild-type OAV287 was cut with SphI whereas the DNA
amplified from OAV100 was not (Figure 8A). Similarly
OAV200 DNA was cut with ApaI, EcoRV and NotI whereas
OAV287 DNA was not (Figure 8B). Other viruses were
similarly characterised by restriction enzyme digestion.
Identification of MLP/TLS elements and Construction of pMT 
OAV287 TLS elements were identified as follows and
as described (17). mRNAs present in OAV287-infected
CSL503 cells were copied into cDNA by reverse
transcription using primers complementary to the IIIa or
fiber genes. A primer thought to fall within TLS exon 1
was then paired with each cDNA primer for PCR. DNA was
successfully amplified, cloned and sequenced. This
identified TLS exons 2 and 3 (which correspond to bases
8083-8145 and 8350-8412 of Figure 1, respectively) and the
3' boundary of TLS exon 1 which occurs at base 5044 of
Figure 1. A second PCR strategy was then used to obtain
MLP and TLS fragments suitable for assembly into pMT. The
region in Figure 1 between nucleotides 4861 and 5023,
thought to contain the MLP, was amplified by PCR using a
plus sense primer which added an ApaI sequence at the 5'
end and a 3' minus sense primer which introduced an NdeI
site by point mutation at base 5012. Similarly, the TLS
was amplified using a plus sense primer which introduced
the NdeI site at base 5012 and a minus sense primer which
was complementary to bases 8396-8412 and which added a
HindIII site at the 3' end of the PCR product. The PCR
fragments were digested with ApaI/NdeI and NdeI/HindIII,
respectively and the fragments were cloned into Bluescript
SK+ (Stratagene) cut with ApaI/HindIII. The resulting
plasmid was then digested with HindIII/NotI and a
synthetic oligonucleotide with HindIII/NotI termini and
the sequence shown in Figure 9 was cloned to produce
plasmid pMT. Genes of interest were then cloned into
convenient restriction sites in the MCS. Gene expression
cassettes were subcloned as ApaI/NotI fragments into
pOAV200 or rescued into infectious virus.
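
The coordinates quoted above can be turned into region lengths with a few lines of Python. This is only an arithmetic sketch: it assumes every range is 1-based and inclusive with respect to Figure 1, and the region names are labels chosen here for illustration.

```python
def span_length(start, end):
    """Length of a region given 1-based, inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Coordinates as quoted in the text (assumed inclusive).
REGIONS = {
    "MLP candidate region": (4861, 5023),
    "TLS exon 2": (8083, 8145),
    "TLS exon 3": (8350, 8412),
    "minus sense primer target": (8396, 8412),
}

LENGTHS = {name: span_length(*coords) for name, coords in REGIONS.items()}

# Gaps between the exons: exon 1 ends at base 5044, exon 2 starts at 8083,
# exon 2 ends at 8145 and exon 3 starts at 8350.
GAP_EXON1_TO_EXON2 = span_length(5044 + 1, 8083 - 1)
GAP_EXON2_TO_EXON3 = span_length(8145 + 1, 8350 - 1)

if __name__ == "__main__":
    for name, length in LENGTHS.items():
        print(f"{name}: {length} nt")
    print("exon 1 to exon 2 gap:", GAP_EXON1_TO_EXON2, "nt")
    print("exon 2 to exon 3 gap:", GAP_EXON2_TO_EXON3, "nt")
```

On these assumptions TLS exons 2 and 3 are each 63 nucleotides long and the minus sense primer anneals to the last 17 nucleotides of exon 3.
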
Infection of cells and expression of antigens

CSL503 and other cells were infected with viruses at
a multiplicity of infection of 20pfu/cell as described
previously (21). Infection was allowed to proceed for
24-60 hr. Cells were then incubated in methionine-free
medium in the presence of 35S-methionine to label newly
synthesized proteins. The protein of interest was
recovered from cell lysates by immunoprecipitation using a
specific antiserum against the expressed protein (21).
Recovered proteins were analysed by polyacrylamide gel
electrophoresis and detected by autoradiography or using a
phosphorimager (Molecular Dynamics).

To characterise the genome in molecular terms, BamHI
restriction fragments representing the entire OAV287
genome were cloned into various plasmids and sequenced
using methods described in Sambrook (9) and similar
publications. Sequences were determined on both strands
by using nested sets of deletion mutants together with
synthetic oligonucleotide primers which were synthesized
from newly determined sequences.

The viral sequence of 29,544 nucleotides (Figure 1)
is considerably shorter (by approximately 6.5kb) than the
sequence for human adenoviruses but many genes encoding
structural proteins are identified by their homology with
their Ad2 homologues (Figure 2). It is clear, however,
that the ovine adenovirus genome shows major structural
and sequence variations compared with all other
adenoviruses studied to date (Figure 2), in the regions
encoding both structural and non-structural proteins. In
particular,

(a) the reading frames tentatively identified as
forming the E1A/B regions are named principally on the
basis of their location in the genome. Very limited
homology can be detected between the 44.5kD open reading
frame (orf) and the large T E1B protein of other
adenoviruses. Homology in the putative E1A region of
OAV287 has not so far been detected;

(b) in other adenoviruses the E4 region is normally
located at the right-hand end of the genome. The OAV287
E4? region is tentatively identified based only on the
presence of a protein sequence motif HCHC..PGSLQC which is
found in 18.8kD and 30.85kD orfs in this region.
Identical or very similar motifs are found in the E4 34kD
protein of human Ad2 and Ad40 and mouse adenoviruses;

(c) the distance between the end of pVIII and the
beginning of fiber, which in other viruses defines the E3
region, is only 197 nucleotides (Figure 4). The E3 region
equivalent, if it exists in ovine adenovirus, may consist
of the cluster of open reading frames which are present in
the right to left orientation on the complementary DNA
strand, at the right-hand end of the genome (Figures 2 and
5). However, these sequences show no detectable homology
with any other adenovirus and the functions of these
proteins cannot be deduced from such comparisons;

(d) there is a region of approximately 1kb which
lies between E3? and E4? which has a very high A/T content
(70.2%) (Figure 1). As there are no open reading frames
encoding greater than approximately 30 amino acids in
length on either DNA strand it is unlikely that the region
codes for any proteins, unless mRNAs are generated by very
complex splicing events. This region has no known
equivalent in any other adenovirus;

(e) other differences are apparent in the structural
proteins of the virus. OAV287 lacks homologues of Ad2
proteins V and IX. However, OAV287 has a completely new
gene coding for p28kD which is located on the
complementary strand of the E1A? region (Figures 2 and 3).
This is a structural protein with an apparent size of 28kD
by SDS PAGE which, according to N-terminal sequencing
data, is cleaved from a larger precursor. No homology
between this protein and others in the databases has been
detected;

(f) in most other genomes the VA RNA genes are
located between the Terminal protein and the 52/55k genes.
In OAV287 there is no room for them as the reading frames
overlap.
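
The two measurements behind point (d), A/T content and the longest open reading frame on either strand, can be sketched in Python as follows. This is an illustration, not the analysis actually used. It assumes an open reading frame runs from an ATG to the next in-frame stop codon, and the short sequences in the example are made up rather than taken from Figure 1.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def at_content(seq):
    """Percentage of A and T bases in `seq`."""
    seq = seq.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_orf_codons(seq):
    """Longest ATG-to-stop reading frame, in codons (stop excluded), over all six frames."""
    best = 0
    for strand in (seq.upper(), reverse_complement(seq)):
        for frame in range(3):
            run = 0  # codons counted since the current ATG, 0 when outside an ORF
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if codon in STOP_CODONS:
                    best = max(best, run)
                    run = 0
                elif run > 0 or codon == "ATG":
                    run += 1
    return best


if __name__ == "__main__":
    toy = "ATGAAATTTTAA"  # made-up example: ATG, two codons, then a stop
    print(f"A/T content: {at_content(toy):.1f}%")
    print("longest ORF (codons):", longest_orf_codons(toy))
```

Applied to the roughly 1kb region, a result of about 30 codons or fewer on both strands would correspond to the observation in (d).
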

These differences serve to emphasize the unique
character of the OAV287 isolate compared with other human
and animal adenoviruses. In addition, since the OAV287
non-structural regions show little or no homology with
equivalent regions in other adenoviruses, sequence
comparisons do not reveal the identity of likely
non-essential regions of the genome. Moreover the viral DNA
cannot easily be manipulated to test for dispensable
sequences.

The present inventors have produced a plasmid
containing a full length infectious copy of an ovine
adenovirus genome in which the ITR sequences are linked by
a short sequence which creates a unique restriction enzyme
site. A plasmid containing a full length infectious copy
of an ovine adenovirus genome linked to a bacterial origin
for DNA replication and a marker gene has been produced.
Partial clones of OAV287 genomic DNA were specifically
modified and initially linked to a gene encoding
antibiotic resistance and origin of replication inserted
into the unique SalI site of the genome (Figure 6 and see
Methods). Such a plasmid can be grown in bacteria and
more easily manipulated.

The circular genomic clone differs from the
naturally occurring circles that occur in Ad5-infected
cells (10) and that might exist in OAV287-infected cells
in that the 40 base pair ITRs are joined by a GTAC linker.
Together with the last and first nucleotides of the genome
(G and C, respectively, see Figure 1), this sequence forms
a unique KpnI site (GGTACC) when the ITRs are joined head
to tail. Other sites such as EcoRI, BamHI, SalI, KasI etc
which have recognition sequences beginning with G and
ending with C are suitable if they are unique as the 3'
and 5' terminal nucleotides of other adenovirus genomes
are G and C, respectively. A plasmid with a suitable
antibiotic resistance gene e.g. ampR and origin of
replication can be inserted at the unique site or
elsewhere in the genome to form a plasmid which can be
propagated in bacteria. Plasmids propagated in the
presence of 200 µg/ml ampicillin in E. coli strains JM109
and DH5-alpha retain the KpnI sites and inserted
sequences, indicating that the OAV287 ITR sequences are
stable when linked in this manner. This approach may
therefore be used to engineer other adenovirus genomes.
If desired the GTAC linker sequence can be removed and the
authentic termini regenerated prior to transfection by
digestion with KpnI (or another appropriate enzyme) and
incubation with T4 DNA polymerase to create blunt ends
(9).
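
The way a short linker and the terminal G and C of the genome combine into a restriction site can be illustrated in Python. This is a sketch under stated assumptions: the terminal sequences are hypothetical and chosen only to end in G and begin with C, and the function names are invented here. The recognition sequences are the standard ones for the enzymes named above.

```python
# Standard recognition sequences; each begins with G and ends with C.
SITES = {
    "KpnI": "GGTACC",
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "SalI": "GTCGAC",
    "KasI": "GGCGCC",
}


def linker_for(site, last_base="G", first_base="C"):
    """Return the linker which recreates `site` when placed between a genome
    ending in `last_base` and beginning with `first_base`."""
    if not (site.startswith(last_base) and site.endswith(first_base)):
        raise ValueError(f"{site} cannot span a {last_base}/{first_base} junction")
    return site[len(last_base):len(site) - len(first_base)]


def join_termini(right_end, linker, left_end):
    """Sequence across the head-to-tail junction of the joined ITRs."""
    return right_end + linker + left_end


LINKERS = {name: linker_for(site) for name, site in SITES.items()}

# Hypothetical terminal sequences, not taken from Figure 1.
RIGHT_END = "TGATG"  # last bases of the genome, ending in G
LEFT_END = "CATCA"   # first bases of the genome, beginning with C
JUNCTION = join_termini(RIGHT_END, LINKERS["KpnI"], LEFT_END)

if __name__ == "__main__":
    for name, linker in LINKERS.items():
        print(f"{name}: linker {linker}")
    print("junction:", JUNCTION, "contains KpnI site:", SITES["KpnI"] in JUNCTION)
```

For KpnI the computed linker is GTAC, matching the linker described in the text.
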

A method for generating linear infectious genomes
from circular plasmids involves digesting the circular
plasmid containing the full length copy of the OAV287
genome with restriction enzyme KpnI to generate a genome
with the authentic 5' nucleotide dCMP. The linear DNA is
then introduced into CSL503 cells using lipofectamine as
the transfecting reagent.

To develop a viral genome as a vector it is
essential to identify region(s) of the genome which are
non-essential for function. These regions can then be
substituted or deleted to make room for foreign DNA (11,
12), or they may be the site for insertion of foreign DNA.
In the human adenovirus genome DNA has been substituted or
inserted into the E1 and E3 regions (13, 14, 15) and at
the extreme right-hand end of the genome between E4 and
ITR, usually with the concomitant deletion of
non-essential regions to facilitate packaging of the genome
(16). Adenoviruses will package genomes up to ~6% larger
than the wild-type, probably due to physical constraints
dictated by the capsid structure (11).
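
To give a sense of scale, the ~6% figure can be applied to the 29,544 nucleotide OAV287 genome. The sketch below is only an arithmetic illustration. It assumes that the packaging limit reported for human adenoviruses also holds for OAV287, which the text does not establish, and the function names are invented here.

```python
OAV287_LENGTH_NT = 29544
OVERSIZE_PERCENT = 6  # assumed to carry over from human adenoviruses


def max_packaged_length(wild_type_nt, oversize_percent=OVERSIZE_PERCENT):
    """Largest genome expected to be packaged, using integer arithmetic."""
    return wild_type_nt + (wild_type_nt * oversize_percent) // 100


def insert_capacity(wild_type_nt, deleted_nt=0, oversize_percent=OVERSIZE_PERCENT):
    """Foreign DNA that fits after deleting `deleted_nt` nucleotides of viral sequence."""
    remaining = wild_type_nt - deleted_nt
    return max_packaged_length(wild_type_nt, oversize_percent) - remaining


if __name__ == "__main__":
    print("no deletion:", insert_capacity(OAV287_LENGTH_NT), "nt")
    print("with a 1000 nt deletion:", insert_capacity(OAV287_LENGTH_NT, 1000), "nt")
```

On this assumption roughly 1.8kb of foreign DNA could be added without any deletion, and each deleted nucleotide of non-essential sequence adds one nucleotide of capacity.
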

Non-essential sites in the OAV287 genome were
identified by insertion of a polylinker sequence
containing ApaI and NotI restriction sites. This linker
was introduced into the genome copy in pOAV100 between
nucleotides 22,139 and 22,130 of Figure 1 by site directed
mutagenesis to create plasmid pOAV200 (Figure 7). This
corresponds to a site located in the intergenic region
between genes for the pVIII and fiber proteins which was
chosen because it avoids disruption of RNA processing
signals in the region. A transcription termination site
for the L4 family of RNAs maps 26 nucleotides upstream and
the splice junction between
the tripartite leader sequences and fiber mRNA maps 144
nucleotides downstream of the insertion site, respectively
(17). Transfection of pOAV200 into CSL503 cells resulted
in the rescue of virus OAV200. The second site at which
the polylinker was inserted was located between bases
26,645 and 26,646 of Figure 1. This created plasmid
pOAV600 (Figure 7). This insertion site corresponds to
the right hand end of the A/T-rich region (Figure 2) whose
function and precise boundaries are unknown. The site was
chosen as it is six nucleotides to the left of the
transcription termination point for RNAs transcribed from
right to left from the E3? region (Figure 2). This was
determined by sequencing cloned RT-PCR-amplified cDNAs
derived from the region using methods similar to those
described for the pVIII/fiber region (17). Transfection
of pOAV600 into CSL503 cells yielded virus OAV600.

The above insertion strategy identified two regions 
of the genome which can be interrupted and created sites 
for subcloning gene expression cassettes. 
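
Inserting a polylinker between two adjacent nucleotides can be modelled with simple string slicing. The following Python sketch makes several assumptions: the polylinker sequence is hypothetical (the real one is not given here) and is built from the standard ApaI, EcoRV and NotI recognition sequences, and the genome is a made-up toy sequence.

```python
# Standard recognition sequences for the sites said to be present in the polylinker.
APAI = "GGGCCC"
ECORV = "GATATC"
NOTI = "GCGGCCGC"

# Hypothetical polylinker; the actual sequence used is not stated in the text.
POLYLINKER = APAI + ECORV + NOTI


def insert_between(genome, left_pos, linker):
    """Insert `linker` between 1-based nucleotides `left_pos` and `left_pos + 1`."""
    if not 0 <= left_pos <= len(genome):
        raise ValueError("left_pos lies outside the genome")
    return genome[:left_pos] + linker + genome[left_pos:]


def site_counts(seq):
    """Number of occurrences of each polylinker site in `seq`."""
    return {"ApaI": seq.count(APAI), "EcoRV": seq.count(ECORV), "NotI": seq.count(NOTI)}


TOY_GENOME = "ACGTACGTTGCA"  # made-up sequence with none of the three sites
MODIFIED = insert_between(TOY_GENOME, 6, POLYLINKER)

if __name__ == "__main__":
    print("before:", site_counts(TOY_GENOME))
    print("after: ", site_counts(MODIFIED))
```

Each site is unique in the modified toy genome, which is the property that lets expression cassettes be subcloned as ApaI/NotI fragments.
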

A further non-essential site was identified using
the unique SalI site located at bases 28644-28649 of
Figure 1. The site was cut with SalI, end-filled and
religated to disrupt the reading frames which spanned the
site. A plasmid pOAV600S (Figure 7), which had lost the
site was identified by digestion with SalI. When pOAV600S
was transfected into CSL503 cells, virus OAV600S was
recovered. The loss of the SalI site in this virus was
confirmed by digesting the viral genome with SalI. As the
SalI site falls within two significant open reading frames
(which extend on the complementary strand between bases
28457 and 29014 and between 28511 and 28699), which were
disrupted by end-filling and religation, the gene products
derived from the reading frames are probably also
dispensable. This group of reading frames may therefore
constitute the E3 region of OAV287 as no other gene
products in any adenovirus are dispensable for
replication in vitro. This implies that it should be
possible to delete the whole region labelled as E3? in
Figure 2. In addition, in other experiments a 1kb NdeI
fragment was deleted from the region marked as E4? in
Figure 2. This deletion disrupted several reading frames
in the region. No virus has been rescued from such a
plasmid, suggesting that it is not dispensable and
accordingly, it may be E4.
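
Why cutting, end-filling and religating at the SalI site both destroys the site and disrupts the reading frames can be shown with a small simulation. This Python sketch assumes the standard SalI cleavage pattern (G^TCGAC, leaving a four-base 5' overhang). The flanking bases are hypothetical and the function name is invented here.

```python
SALI_SITE = "GTCGAC"
SALI_CUT_OFFSET = 1  # SalI cuts after the first G on each strand: G^TCGAC


def end_fill_and_religate(seq, site=SALI_SITE, cut_offset=SALI_CUT_OFFSET):
    """Simulate cutting at a unique palindromic site that leaves a 5' overhang,
    filling in both ends and blunt-end religating them."""
    if seq.count(site) != 1:
        raise ValueError("the site must occur exactly once")
    start = seq.index(site)
    overhang = site[cut_offset:len(site) - cut_offset]  # TCGA for SalI
    filled_left = seq[:start + cut_offset] + overhang   # ...G + TCGA
    filled_right = seq[start + cut_offset:]             # TCGAC...
    return filled_left + filled_right


ORIGINAL = "AAA" + SALI_SITE + "TTT"  # hypothetical flanking bases
RELIGATED = end_fill_and_religate(ORIGINAL)
ADDED_BASES = len(RELIGATED) - len(ORIGINAL)

if __name__ == "__main__":
    print("original: ", ORIGINAL)
    print("religated:", RELIGATED)
    print("bases added:", ADDED_BASES, "frameshift:", ADDED_BASES % 3 != 0)
    print("SalI site retained:", SALI_SITE in RELIGATED)
```

The fill-in duplicates the four overhang bases, so the SalI site is lost and any reading frame spanning it is shifted, because four is not a multiple of three.
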

Many viruses replicate incompletely in heterologous
hosts, often entering cells but being unable to produce
mature virus particles because of a block in the
replication cycle. In the context of recombinant viral
vectors, this represents a desirable safety feature,
provided that replication is not blocked before
appropriate and effective expression of the foreign gene
occurs. OAV287 does not replicate productively in
heterologous cell types (18), the only exception so far
being bovine nasal turbinate cells in which viral titres
are significantly reduced compared with the CSL503 cells.
Recombinant forms of OAV287 have been constructed to
determine whether expression of a reporter gene under the
control of an appropriate promoter occurs.

Foreign gene expression requires that the gene be
functionally linked to a promoter. This may be a viral
promoter inherent in the genome, or a foreign promoter
subcloned together with the gene of interest into a
suitable site. The promoter driving gene expression must
function in CSL503 and preferably a range of other cell
types. In this work an OAV287 genomic promoter was used
initially. Subsequently a heterologous promoter was also
used. In adenoviruses, expression of the structural
proteins is driven by the major late promoter (MLP).
Families of RNA transcripts derived from the MLP contain a
common sequence element, the tripartite leader sequence
(TLS) at their 5' ends. The present inventors have
identified those nucleotides in the OAV287 genome which
comprise the TLS by using RT-PCR amplification of late
mRNA transcripts present in OAV287-infected cells and
sequencing of cloned cDNAs (17). A candidate MLP was
expected to be present just to the left of TLS exon 1
(Figure 2). The MLP and TLS elements were subcloned using
PCR techniques into a separate plasmid pMT (Figure 9) and
linked with genes of interest. These promoter/gene
cassettes were subcloned as ApaI/NotI fragments into the
polylinker ApaI/NotI sites of pOAV200. Using this
strategy plasmids pOAV203, pOAV204, pOAV205 and pOAV210
were constructed. These incorporate genes encoding a 17kD
soluble protein from T. colubriformis, a rotavirus VP7sc
gene (19), the 45W antigen from Taenia ovis (20) and a
membrane protein (PM95) from Lucilia cuprina,
respectively. Plasmid pOAV202 contained the 17kD antigen
but lacked the MLP/TLS elements. These plasmids were
transfected into CSL503 cells and rescued as viruses
OAV202, OAV203, OAV204, OAV205 and OAV210, respectively
(Figure 10).

The human cytomegalovirus immediate early IE94
promoter plus enhancer, which functions in a range of
human and animal cell types (21), was also linked to the
rotavirus VP7sc antigen gene. This cassette was assembled
by replacing the MLP/TLS elements in pMT/VP7sc with the
HCMV enhancer-promoter region. The cassette was inserted
in pOAV200 to create pOAV206. pOAV206 was transfected
into CSL503 cells and virus OAV206 was rescued (Figure
10).
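
The constructs described above can be collected into a small lookup table, which makes the "respectively" pairings explicit. The Python sketch below is only a convenience for the reader. The field names and the `by_promoter` helper are invented here, and the entries restate what the text says about each plasmid.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    plasmid: str
    virus: str
    promoter: str  # "MLP/TLS", "HCMV IE94" or "none"
    antigen: str


# Entries transcribed from the description of each plasmid in the text.
CONSTRUCTS = [
    Construct("pOAV202", "OAV202", "none", "T. colubriformis 17kD antigen"),
    Construct("pOAV203", "OAV203", "MLP/TLS", "T. colubriformis 17kD antigen"),
    Construct("pOAV204", "OAV204", "MLP/TLS", "rotavirus VP7sc"),
    Construct("pOAV205", "OAV205", "MLP/TLS", "Taenia ovis 45W antigen"),
    Construct("pOAV210", "OAV210", "MLP/TLS", "Lucilia cuprina PM95"),
    Construct("pOAV206", "OAV206", "HCMV IE94", "rotavirus VP7sc"),
]


def by_promoter(promoter):
    """Return the viruses whose foreign gene is driven by `promoter`."""
    return [c.virus for c in CONSTRUCTS if c.promoter == promoter]


if __name__ == "__main__":
    for construct in CONSTRUCTS:
        print(f"{construct.plasmid}: {construct.promoter} + {construct.antigen}")
```
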

CSL503 and other cells were infected with the
viruses described above and at various times
post-infection the cells were radiolabelled with
35S-methionine. Proteins of interest were recovered from cell
lysates by immunoprecipitation using an appropriate
antiserum. Recovered proteins were analysed by
polyacrylamide gel electrophoresis and detected by
autoradiography.

When virus OAV202 was used, no expression of the T.
colubriformis 17kD antigen was observed by
immunofluorescence. As this virus lacks the MLP/TLS
elements and carries only the 17kD gene this result
demonstrates that there is no viral promoter upstream or
adjacent to the insertion point between the pVIII and
fiber genes which is capable of driving gene expression.
As the E3 region is also missing from this site there is
no requirement for a nearby promoter. This situation
contrasts with observations made using a human Ad5 E3
recombinant (21). In this case a promoterless gene
inserted 3' proximal to the pVIII gene was expressed,
probably from the adjacent E3 promoter or the upstream MLP
(15, 21). This result further emphasizes the unique
nature of the OAV287 genome. Recombinant OAV287 viruses
carrying the MLP/TLS elements were tested for expression
in CSL503 cells. With OAV204, expression was easily
detected in infected, but not in uninfected cells at 24hr
post-infection (Figure 11A). Similarly, when viruses
OAV205 and OAV210 were tested, gene products of 24kD and
approximately 95kD, respectively were detected (Figure
11B). Therefore it is clear that MLP/TLS elements contain
the necessary information to drive gene expression in the
homologous cell line under replication-permissive
conditions. However, when OAV204 was tested in a
heterologous rabbit kidney cell line in which the virus
does not replicate productively, no VP7sc expression was
observed. Some replication occurs in bovine nasal
turbinate cells, although to a lower titre than in CSL503
cells. In the latter cells, expression of VP7sc was
detected following infection with OAV204 (Figure 11B).

Virus OAV206 containing the HCMV enhancer/promoter
element linked to the VP7sc gene was used to examine the
function of a heterologous promoter in the context of the
OAV287 genome. CSL503 cells infected with this virus
readily expressed VP7sc antigen at 24-48hr post infection
(Figure 12A). With this virus VP7sc expression was also
observed in the non-permissive rabbit kidney cell line and
in bovine nasal turbinate cells (Figure 12B). These
results suggest that the HCMV or a similar constitutive
promoter may be preferred over the MLP to drive gene
expression in OAV recombinants in non-permissive cells.

One recombinant virus was also administered to
sheep. Five sheep were vaccinated intraconjunctivally and
intranasally with 0.7 x 10^8 pfu of OAV203. At three days
post-inoculation virus was recovered from the nasal swab
of one sheep and from the conjunctival swabs of two sheep
and confirmed as the recombinant virus by PCR analysis.
Animals showed no obvious ill effects from such
vaccination.

The viral vectors of the present invention can be
used for the delivery and expression of therapeutic genes
in grazing animals. In species which are not normally
infected by ovine adenoviruses the lack of pre-existing
immunity should allow efficient infection, gene delivery
and expression. The genes may encode vaccine antigens,
molecules which promote growth in production animals,
molecules which modify production traits by manipulating
hormone responses and other biologically active or
therapeutic molecules. The virus does not replicate
productively in many non-ovine cells but the use of
heterologous promoters allows the delivery and expression
of genes while minimising the possibility of virus spread
to a non-target host. As the DNA of adenovirus vectors
can persist in cells in an unintegrated form, with the
appropriate choice of promoter, expression over a
prolonged period can be achieved.

CLAIMS:

1. An isolated DNA molecule comprising a nucleotide
sequence encoding the genome of ovine adenovirus (OAV287)
substantially as shown in Figure 1 or a functionally
equivalent nucleic acid sequence.

2. The DNA molecule as claimed in claim 1 such that the
nucleic acid sequence encoding the genome of the ovine
adenovirus is substantially as shown in Figure 1.

3. An isolated DNA molecule comprising a nucleic acid
sequence encoding the genome of ovine adenovirus (OAV287)
substantially as shown in Figure 1 wherein a portion of
the adenoviral genome not essential for the maintenance or
viability of the native adenovirus is deleted or altered.

4. An isolated DNA molecule comprising at least a 15
nucleic acid base sequence being substantially unique to
the ovine adenovirus (OAV287) nucleic acid sequence as
shown in Figure 1.

5. The DNA molecule as claimed in claim 4 such that the
at least 15 nucleic acid base sequence encodes a
functional element of ovine adenovirus (OAV287).

6. The DNA molecule as claimed in claim 5 such that the
functional element is selected from the group consisting
of promoter, gene, inverted terminal repeat, viral
packaging signal and RNA processing signal.

7. The DNA molecule as claimed in claim 6 such that the
functional element is the inverted terminal repeat having
the nucleic acid base sequence 1 to 46 as shown in Figure
1.

8. A plasmid including the DNA molecule as claimed in
any one of claims 1 to 7.

9. A plasmid including the DNA molecule as claimed in
any one of claims 1 to 3 such that the nucleic acid
sequence encoding the adenovirus genome or a portion
thereof is linked to a nucleic acid sequence encoding an
origin of replication and a further nucleic acid sequence
encoding a marker.

10. The plasmid as claimed in claim 9 such that nucleic
acid sequences encoding inverted terminal repeats of the
adenovirus are joined.

11. The plasmid as claimed in claim 9 or 10 such that
the nucleic acid sequence encoding the marker encodes for
resistance to an antimicrobial agent.

12. A viral vector comprising a DNA molecule including a
nucleic acid sequence encoding the genome of ovine
adenovirus (OAV287) substantially as shown in Figure 1 or
a functionally equivalent nucleic acid sequence or a
portion thereof and at least one nucleic acid sequence
encoding a non-adenoviral polypeptide or polypeptides.

13. The viral vector as claimed in claim 12 such that
the nucleic acid sequence encoding the genome of the
adenovirus is substantially as shown in Figure 1.

14. A viral vector comprising a DNA molecule including a
nucleic acid sequence encoding the genome of ovine
adenovirus (OAV287) substantially as shown in Figure 1
wherein a portion of the adenoviral genome not essential
for the maintenance or viability of the native adenovirus
is deleted or altered, and at least one nucleic acid
sequence encoding a non-adenoviral polypeptide or
polypeptides.

15. The viral vector as claimed in any one of claims 12
to 14 such that the nucleic acid sequence encoding the
polypeptide or polypeptides encodes a polypeptide or
polypeptides derived from bacteria, viruses, parasites or
eukaryotes.

16. The viral vector as claimed in claim 15 such that
non-adenoviral polypeptide is rotavirus VP7sc antigen, the
parasite polypeptide is Trichostrongylus colubriformis
17kD antigen, the Taenia ovis 45W antigen or the PM95
antigen from Lucilia cuprina.

17. A method of delivering a DNA molecule having a
nucleic acid sequence encoding a non-adenoviral
polypeptide or polypeptides to a target cell, the method
comprising infecting the target cell with a viral vector
as claimed in any one of claims 12 to 16 such that the DNA
molecule encoding the polypeptide or polypeptides is
expressed and the polypeptide or polypeptides is produced
by the target cell.

18. A method for delivering a DNA molecule having a
nucleic acid sequence encoding a non-adenoviral
polypeptide or polypeptides to an animal, the method
comprising administering to the animal a viral vector as
claimed in any one of claims 12 to 16, such that the viral
vector infects at least one cell of the animal and the
infected cell expresses the DNA molecule encoding the
polypeptide or polypeptides and produces the polypeptide
or polypeptides.

19. The method as claimed in claim 18 such that the
animal is a grazing animal.

20. The method as claimed in claim 19 such that the
grazing animal is a sheep.

21. A viral vector comprising a DNA molecule including a
nucleic acid sequence encoding the genome of ovine
adenovirus (OAV287) substantially as shown in Figure 1 or
a functionally equivalent nucleic acid sequence or a
portion thereof and at least one nucleic acid sequence
encoding a functional RNA molecule.

22. The viral vector as claimed in claim 21 such that
the functional RNA molecule is an antisense RNA molecule
or ribozyme.

23. A method for delivering a DNA molecule having a
nucleic acid sequence encoding a functional RNA molecule
to an animal, the method comprising administering to the
animal a viral vector as claimed in claim 21 or 22, such
that the viral vector infects at least one cell of the
animal and the infected cell expresses the DNA molecule
encoding the functional RNA molecule and produces the RNA
molecule.



Abstract of the Disclosure 



A genome of an ovine adenovirus designated OAV287 is isolated
from sheep and sequenced. Portions of the genome not essential for
maintenance or viability of the virus can be deleted or altered.
A nucleotide sequence encoding a non-adenoviral polypeptide can be
incorporated into the genome. A full-length clone of the genome
can be provided as part of a plasmid or viral vector. Cells can be
transformed with a vector of the invention such that they express
an exogenous protein.
Fig 1



CTATTCATAT 

CATGGAATTT 

TTTTTTACTT 

TACATACAAG 

ACGCATAAAT 

TCCTGCTGAT 

TTGTTCATCT 

GCTGTTTCCA 

TCCTTAAAAA 

AGATATAATT 

TAGGATCAGG 

TGGAGAGGAC 

ATCTGTATCA 

TTTGAAGACT 

AGGAATATTG 

GGATGAAATG 

TCTCTGAACA 

TGGCTTTGGT 

GTTATTTGTG 

CAAATGGAAA 

GAATCAGATA 

GATCAAGTGT 

GCGGTTTGTT 

CATGTTATGT 

TCTGTGATTT 

AATGACTCCT 

ATGGTATCAG 

TTTTTTAAAA 

CCCTGGTAAT 

CGTGTCTATG 

AGCTGAAACT 

AATAAAACCC 

TTCGGCGGAG 

TTCGGCATCT 

TCAAGATCCA 

TTATTACTGG 

AGTACAACTT 

AGATTTTTAC 

TGGCCAGTTA 

AACTTCAATA 

TAATAGAAAT 

TTTTAATGGT 

TCAAAATCAA 

AAATAATAAT 

GTATGAAGGC

TAACCATGCT 

AACGATACAG 

CGG7AATTTT 

TAAATGGTGC 

TCAAGTTCAG 

TAATGTAACC 

GTAAAAAACT 

CTGCGTATAA 



ATATAACGTT 

ACAAAGAAGT 

ATTACATTTT 

CCAAAAT7CG 

GGACGTACAG 

GCCGCTGCAG 

GCTGCTTTTA 

AAAGCTTGCA 

TAGGCCAACC 

AAGCGGAGCA 

CCAAGAAGTG 

TGTTAAAATT 

TGTTCTCCAT 

TCTGTTTCTT 

ACGGTAATGT 

GGTTTTGTGG 

TAAGTATTTT 

CTTTGAAATT 

ACGTACATCC 

AGAAATTGCT 

GTGCAGGATT 

CTTGGATATG 

TTTGTTTGTG 

AATGAAAATG 

TTTTGCCTTA 

GCATTTACAG 

CAGATATTTA 

AAAATGGCCT 

GTTTGTAACA 

TTTTGTGTCT 

CGCCAGAATT 

TAATTTTTAG 

TTTTCTGTTG 

TCTAATAATT 

ATAGCCTTCT 

AAGTGTATCG 

GTCGGACCTG 

GTCGTGTTTG 

AGTTTAGGAC 

GTCAATTGTA 

TTTTGGAATG 

TGTAGAATTG 

TTTTATGATT 

GTTATTGTTA 

CATTCCGAAA 

GATAACGGAG 

TTAGCATCAT

ATTACTGGAT 
GTTGCTGAAG 
ATGAAAAATA 
TTTTATTCAA 
GTTCTTTTTC 



GCACAGAGG^ 
AAGTTGTTGG 
TCATCTTTTT 
CATAAAATGT 
CAGCAATTGG 
AAAGGATAGA 
TTATATCTTC 
TCATCGGATT 
CATCTAAAGC 
ACCGAGAGGT 
AACCAAAAAG 
GCAAAACGGT 
CAGAAGGTCT 
GAAATTCTGT 
TATTCACATC 
GTTCTTTCAA 
CTGATTTTGG 
TTTTCTTCCT 
TGTTAGCTAC 
GAAACCTTCT 
TTTTCTTTTT 
TTTAAGAGAT 
CAAATCTAAA 
ACGTCGGGGA 
TTAGGAAATA 
AAAGGAATTT 
ACCCAATATG 
TTATTTATGC 
AACTTGATAT 
TAGTGTGTTG 
GTCACGCGGT 
TTTGTAAAAA 
AATTTCCTTA 
CATCGAGTCA 
TTCAAACTAA 
AACTGTCAAA 
GACCTGTGTT 
AAAATATCAA 
TTACAACTCA 
AC7TTAAAAA 
CGAGAAAATG 
GAATTTCTAA 
GTCAAATCTG 
ACTGTAGATG 
ATAATAATCC 
GCAATGTCTG 

fj,rr> rjf ti m t* "j* s+ \ 
H.iA*i:i \ff A 

GAGATGTAAA
GTAATTTCTA
CTGTAAAAGA
TTGTAGAAGG
AACAAAATGG 
TAAACACTCT 



TGTG 
ATCTTTATTC 
TACTTCACAT 
CTTACTTTAA 
AATAGCAGGA 
TGCTATCGTA 
TGCCAATCTA 
TTCAATTAAA 
AGTTAAAAGT 
TAAATTCCAG 
ACTTGTAAGT 
ATCTAATGAC 
TATTGGGAAG 
TTTCGGTAAG 
TACAATTTCT 
TATATAATTG 
CGGTTTTTTG 
TTTTCTGTAG 
ACGATTTTCC 
ATTAATCATA 
GATACTGATA 
ATAACTCTTC 
TTTGATGTAC 
TTGAATGGAT 
AATTTGTGGC 
GTACTGTGTT 
GATTAAGGCA
TAGCGACTTG 
CATCAAGAAA 
GCTTGCTTCT 
AAGCAAATTT 
TAGAATTCAA 
TGTTTCTAAG 
GAATATTGAC 
CAATACGGCT 
GCCTATTCAC 
TGTTTTCAAC 
CTTTATTGAA 
CAGTGCTGTA 
TTTTAGGGGA 
GAATCAGCAG 
TACTGGTTCA 
TTTTAATGTA 
TGCTTATCTG 
CGCTAAGGG - 
GCCTACTCAG 
TGATAATCAA 
CATTGTAAAT 
TGGTAATACA 
CAAAGTG . . * 
TAACATGACT 
ATTTACATTT 
TCTAATTTCC 



GGTTTTTTAT 
ACAATTCTTT 
GATATTTTAC 
AAAGTTAAAT 
AGGGCCATTG 
CGCATAAACC 
GGTGATATTT 
TGGATTGGAT 
ATTCTCCCTC 
GGTCCTCCGA 
AGAAGTTGTC 
CATTTCTTCT 
TACCATTGGT 
CGACTAGCAG 
GGAGGAATCC 
CGAGGAGGGT 
CTTTTTCGCG 
GCTCCTCCTG 
CGGACTGCAA 
TAAATTGTCA 
ATTTATACTA 
ATTGTGATCG 
ACAATATTCT 
TGAGCCTTAT 
GCCAGTACGA 
TTGCTTGACT 
AATTTATGGG 
GCGTTGTTAA 
GATCTTCCTG 
TTCTGTAAAG 
CTGGCACAAC 
ATTTTTAACG 
CCAATTGTTC 
TTTCCTGTTC 
TACTTACAAC 
ATTTACGGTC 
AGTGAAAGTG 
GATGAATTTC 
TGGTTTATCA 
GCGGCTCTTT 
CATTTAGTTT 
TCTGAATATT 
ACCGGGGGTA 

; wA lull vjtvjA^ 

ACTTTCTGCA 
TTTAAACTTA 
GAAATTCCAC 
TTTTCTACCA 
CATGCAGCTA 
ATTATTGGGT 
CCAAAAATTG 
AAACGTTTTA 
ATACATGCTT 



TGTTTATTGT 

TAACAATGAC 

TTAAATTTTG 

TTTTTTTTTA 

TAAAGTGTGT 

CCCCTCCTAT 

GCTTTTGAAT 

TTGCAGAATT 

CAGGAACCAC 

AGAGAGTATC

TGATATGCTT 

TTACTTTTAC 

CACGAGCATC 

TTATGGTATT 

ATCTTGCATA 

TTTTCCAAAA 

CTCTTTTTCT 

CTAAAGCTGT 

ATTTTTTTGC 

GTGGAATCAT 

TTATGTATTG 

CATGTGGTTA

AGCGGGAGTA 

TTGACATTTT 

TGGAGATTGG 

TTAATTTAAG 

CTTTCTCTGA 

ATTCTTACAT 

AAGATTTTAC 

GTTCTAATTT 

TATCAAAATT 

CCACAATGAC 

CATGGCCTGC 

TTAAACCAGA 

CTGGAGCTAC 

AAGGAGCTAC 

TTATTCCTGA 

CTATTAGAAG 

ATGTATGGAA 

GGTATTCAGA 

CAAATTGTCG 

CCATAGCCAG 

ATTGGTCTAG 

ATAACATGTG 

ATAACATAAT 

CAGATGGATC 

CTTGTTATAG 

CAAAAATTGA 

AC GATGCTG-j 

GTTCTGGTAA 

GTACAATAAA 

CATATTGATT 

GATAAAACAA 



60 
120 
180
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800
1860 
1920 
1980 
2040 
2100 
2160
2220 
2280 
2340 
2400 
2460 
2520 
2580
2640 
2700 
2760 
2820 
2880
2940 
3000 
3060 
3120 
3180



WO 96/03508



PCT/AU95/00453 



2/23 



Fig 1 (cont) 



ACTTTGTAAA TTCATAAATA TAGGTTTGAC TTGATCAGAA GGTGAATAAT AGCTCCATCT 3240
AAATGATTCG GTAATAGGAA CATTATTATA TATTAACCAG CTATATTTTG AGTTAACTCT 3300
TGCATGATCC ACTATATCTT TAAGTACAGG GATAAGTGCA CTCGGAAATC CAAAAGAATA 3360
QTTTTTAATA ^rcTATTTA TCTGTGAAGA ATCAAGCTGC GGACTAATAA CATGACATTT 3420
TGATTGAATT TTTAAATCCT TAATATTTCC TCTATCATGA CGCGGGTTCA TATTATGTAA 3480
AACTACTACA ACAGTGTAAC CATTACATTT GGCAAATCTA TTAAAAATTT TTGACGGTAA 3540
AGCATGAAAG AAAGAACTTA TAGAATGACA TGATCCCAAT TGATTCATAC ATTCATCTAT 3600
TATAATACAG ATAGATCCTT CACTTGCAGC TCTGCAGAAT ATATTATCTG GATTATCAAT 3660
ATTTAGATTA GTATCGGAAA TAGCATCTTT GAAAGCTAAT TGTATAAATT TTGGATTTAA 3720
TGTTTTTGTT AGTGGATTAG AGAATGCATC GTAGTTTCCT TCAACACACT GTGCTTTCCA 3780
CGCAATTTTT TCTTCTAATG GAACAGTACC TTTTTCTGGA GTTATGAAAA AAATTGTTTC 3840
TGGTATTGGA TCAATTAGTT TTCCAGATAT AATATTTCTT ATAAATTGAG ATTTTCCGCT 3900
ACCTGTGGGT CCATATACAG TAACAATGAA TGGTTGTAAT CCGCAGTTTA AACTGGGTAT 3960
ACAGCCATCT TTTAACAGAT TGTGAGCCTC ATTTACAGTT TTTTGATAAT TTACAGCAAT 4020
ATTGTGTAAA TCAGTCATAA GTTGACCATG ATACATACAT TTATCAAAAA CTTCTTGACT 4080
TTCTGGAAAT GGATTTCTGC AAATAGAAGG ATCTATCTTT ACAACATCAT TTTTCCAATT 4140
TAATGTGTCA CTTAAAAATT TTCCCAAAAA GGATTTTCTG TCAATGGTTC TTGCGGTCTT 4200
GGATTTGGGT GTCTCTTGTC GTACGGGTAA AGTAAGTATC CTTTCTTCCA CTGGATCCCT 4260
TTCCTCATCG TTTGATCCTT CCAAGGTCTC AGAATTCTGG TTAGTTGCTT CTCTACCACC 4320
GTGAATGGTA CATCGGTTCC ACTTGCGGTT TGCAGTGTCT TTTTTAAACT TTTCCTCGAT 4380
GTCTGAAACT CTTTCTGTGG TTGTTCTAAT AAATTATAGT CAGTAAAACA ATGTTTTAGA 4440
ATTTCATAGT TTAAACAATT TTTAGCATGA CCTTTGGCTC TTAATTTTCC TTCTCCAATA 4500
AATTTACAGT TTTTACAAGT TATGTCTTTT AAAGCATATA ATTTAGGAGC TAAAATACAT 4560
GTTTCTGAAC TGAATGCTTC AGCTCCGCAA CGGTTACAAA CAGTTTCGCA TTCAACCAAC 4620
CAAGTTAGAC ATGGATGTTT TTCATCAAAG ATTAAATTTG AGTTATATTT TTTAAGTCTA 4680
TGTAATCCTT TTGATAACAT GAGTTGGTGG CCCTTTTCTG TTAAGAATAA CGAGTCTGTA 4740
TCACCATAAA TACTTTTTAT CTCCCTTTCT ATGTAAGGTT TACCCATATC TTCCCCATAT 4800
AAAATTTCTG CCCACTCACT CATGAAAGCT CTGGTCCAAG CCAGCACAAA GGATGCTATC 4860
TGAGTTGGAT ATCGGTTGTT CTTGATCCAT TCTTCCTTAT CCTCAATAGT TGTTAAAATT 4920
AAATCATTAC AATCAGCAGA TAAAAAAGTT ATAGGCTTAA AAGTCACGTG ATCTTGATTT 4980
CCTATAAAAA GTGGAAAATT AAAATTTTCA TTTGTGTCTT TGGAATCTTT GGGCGGCATT 5040
TCAGGTAGGT TTGAAAAATA CTGATTCCAC TCAAATGAAC GTTTTGGTAA TGATTTACTA 5100
ATCACAGTTG TGTATGATGT AATTTCAGCT GATCCATTTT CTAATCTTTT TTTATCTTTC 5160
TCTTCAATAT TTTCAGCAAA CACTACTTTC TTTTTATCTA TACGGGTAGC AAACGAACCA 5220
TATAAAGCAT TTGATAACAA TTTACTTATA CTTCGCTGAA TCTTGTTGTT ACTTTTACTT 5280
GCTTTTTCTT TAGCCATAAT ATTTACTTTC ACATATTTTT GACATAACGG TTTCCAGTCA 5340
CTCCATACAG CATACATTTC AGAGCTTTTG ATTATTTTGC ATTTCCATCC TCTATTGTGT 5400
AAGGTGATTA AATCGATAGA GGTCAGTACT TCATTTATCA ATGTTTCATT TGACCAGCAT 5460
AACTTTCCAC TTTTTTTAGA ACATAATGGA GGTAACACAT CAAGATAATC TAATGATGGG 5520
GGTTCACAAT CGGCTACCAC AATCATAGGT TTGATTGAAT TGTCAAAATA ATCTATTTTT 5580
TCTTTTCTTT GTAGTAGTTC TTGAAAGTAA TCTATTTGTG CATTGGCTTC AAAAGCATTT 5640
AAAGTTTTTC CATATGGAAG TGGATGCGTT AAGGCACTAG CATACATTCC GCAGATATCA 5700
TACACATATA TTGCTTCTTC AAATATTCCT AAAAATGAAG GATAACATCT TCCTCCTCTT 5760
AAACTCATTC TAACAAAATC ATACATTTTT TCTGATGGAG CTTCCAAATT TCTTAGGAAT 5820
TCAGAGGGAT GATCTTCTTC ATTATAAAAG ATTTGTTTAA ACAATGCTTG AGTATTACTA 5880
CTAATTGTAG GACGTTGGAA TATATTAAAA GAACACTCAA GCTTTAAAGA TGTTGTACAG 5940
AACTCTTGAT AACCTTCTAT AAGTTTTTCA ACTAATTGAG CCGTAACTAT AACATCATCA 6000
ATACAATACT CCTTAGCTTC CTCTAATAAG TTGTATTTTT GGTTGTGTTT TGGTTTGTTT 6060
TGTAAATATT CTTCAAATGA ATTCCAATAT TTTTGAACTG GATAACCATT GTTTTCTTTT 6120
TCATATTCTC CCAACATAAA AAAATCATTG ATTGCCCTGT AAGGACAATA ACCTTTGCTA 6180
ACACTCAACT GATATGCAGT AGCAGCGTCT CTTAAAGAAG AGTGGGTTAA CAAAAATGTA 6240
TCCCTAACCA TAAATTTTAT ACCTTGCCAT TTCATATCTT CAAAATTAAT AATTCCATTT 6300
TTCCATCTTT CATAAGTTGT ATGTGAAGGT TTCTTAAAGC AAGGATTTGG AAGAGATAAT 6360
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GTAATATCAT TAAATAACAG TTTTCCAGCA CGAGGCATAA AGCTTCTTGT CAGCTTAAAC 6420 

ATTGAAAGTT CTTCACTGTC TATTCCTTCT AATACATGAC TTGCAAGTAT GATTTCATCA 6480

AAACCACAGA TATTATGACC TACTACATAT AATTGAATAT ATCTTGGTTC GCACTGTTTT 6540

AATTTTTTTT CTTTATTTAA GACCATGATG TCTTCATATG ATAAATTTGA TTCAAGACCA 6600

TGATTTTCAC AAAACGTTGA CCAGTATTTT TTAGCTACTG AAATTTGTAG CTCTGTTCTG 6660

AATTTTTTAA AAGCTATGCG AATTTCATCT TCTTTTTTAT TTAACATTAC AAAACATTCT 6720

CTGTTTACCT CATAACCTAT ATCGGTAGCT ATTTTAGAAG CAATTTTTAT GAGTGATTTA 6780

CATCCAATTA ACTTAAAAAC CAACAAGTAA GGAGTTAACT GTTTTCCATA CAAAGAATGG 6840

TAAGTATATG TTTCAATATC ATAAACAATA AAAAGACGTT TTGCTTTTAT GGCTCCAACT 6900

GGATTAAATT TGATTTTTTC CCACCAGAGT TTTGTTTCAT GGTGAATATT GTGATAATAG 6960

AAGTCCCGTG TTCTGGATGA GCAGTTGTGT ATAXTACTAT AAATTGTTCC GCAGAATTCA 7020

CATTTAXTCT GTTGTTTAAC AGTTTTTATT AAATATATTT CTCCTTTTAA AATCAATAAT 7080

TCTATTGGTA ACAAATTTCC ATTAAGAATT TCTTCAGTCA TCTTAAAAAA TCTTTTGTTG 7140

AACXTCCATA TTTTTAAAGA TACGGGGGTG TTAGAATCAC AAAGTTTTAA AACATCTAAA 7200

ACATTTTCTA CTTTCTTGAA AGAATTTAAT TTTAAACCCT GAATTGCAAA GTAATTATAA 7260

AAACTTTTTT CAAAATTCTT GTAGTATATA ATTTTTATAT ATGTATCCTC ATATATTCCA 7320

GTAATAXAAG TAGTAGTTCT TTGCTTTATT ATTGTCTTTG AAGCCATCTG TTTAAAGCCG 7380

CTTCCCGTAC TCGCTCAAAG CTTCTTAAAA CAACTTCATT TGTACTATAG CCAACAATTC 7440

CAGACAATTT TATTCTAAAT GCTATTTCAA CTGAATCTAA ATCTGAAAAA TCCGTGTTTA 7500

CTTGGTTGAT TACTTCTTCT ATGCTCCCAC TGTCTTCTAC GAAGTCTATA TCTTGAAGTA 7560

ATTGGTGTCT TTCTTCTGGA GTXGAAAAAG AGTAAGATCT TTCAXTAGCT TCTATAATTC 7620

CTAAAAAATC ACGAGTTATT CTGCTATATA GTTGTCTGAA TGCTTGTGTT TCTCTATTAA 7680

ACCAAACTCT AGTAAATATA TCTTCTCCAT TXTCATTTCT ACCTCTTAAT ATAATTTGAA 7740

CAAATTGGAT TCCAATATTT CTGGGAGCTA ACCTATTTTG CACTAAATTT AAGTATAAGT 7800

AATATAGCGT GCTTGCCACA TGCTCTAATA TAAAGAAATA CACTAACCAT TTTTGAATAA 7860

AATCATCAGX CAATCTATTT TCATTATAAA ATGTAATAAG TAATTGAAAA AATTCACTTC 7920

CGTAATTAAA AAAATTACTC CTTGTTGCTX CAGGAGTTAA TTCTTCTTCT AAATTTTGAA 7980

TTAAATCTAC TATTGAAGCT ATCACTTCAT CATTAAATTC TTCCCTACTC AGAXCGCTTG 8040

AGCTCGGCTG GCGATCTGAA AATCCTTCAT CTTCTATTTC AGGAACAGTA AGAGGAGAAC 8100

TAGAAGTTTC TTCAACATTC CTTACCCTTT GGCGTCTATT AACAGGTAAT CTATCAATAA 8160

ATCTTCTGAT TACATCACCC CTTGAACGTC TCATTATTTC AGTAATAGCT CTAXAATTTT 8220

CCCTAGGTCT TAATCTGAAT GGTAATCCTA CTCTTGTCCC TGACCTXAAA GTTAATGCTC 8280

CACCATGCAT CCCACCTTTT CCTAAAGTTA ATACAGTTGC TAAATCTTTT AAATTAATTC 8340

GATTTTCAGC TTCTGGAATT TCCAGCTGTG AAAATTCATC TATAAAAAGC TCAAXCCAGA 8400

ATTCAGAAAA AGGTAAGTCT AATATACATT CACTATTATG CATGXTAGAC AAAATTAAAA 8460

ATTTACATAA AGCTTTTTTA ATTTTACAAA TTAACTTTAT AAGGTAAGTA TCCCTTTCTT 8520

GCAAATTTAA AACCATAAAA GCTTGAGAAA AAGGTTGATA ATGCTGCXGA AAAGATCTAT 8580

XGTGATTTTG AGCTGAAATA GCGGAGCGAA AACCTTGCAT GTCTGCAAGT XGCAGACTCC 8640

CTAATATTCT ATCCATTAAA ACCGGGTTTT GAATTTGACT AATTGTTTGT GAAAAATTTT 8700

CTACATTTTG AATTGGTCTC ATATATGAGC CAGTATTTAT GGAGTATGAA CAATCAGTTA 8760

AAATTTGGCA GGTCATGCGT CTGTGAAAAC TTATAGGTGA AAGATACAAC TTATATGAAA 8820

TGTTGCTGTA AGTCCGCTGA TCAAACAGAT ACTGGTTTAA AAGTCGCGCG ACATAAAAAT 8880

ACCCAATTAA TAAATTTGGT GGAGGTTGTC CTTCAAATGG TGGTTGTGAA GTAACAGGTC 8940

CTCTTGGGCG TAAATCGAGT AATTGAGTGA CTGGATAATT AAAAAATCGA TTAGGCCATT 9000

TTATTCGGCT TTCATGTATA GTCCTTGACC TGGGAATAGT TCGATTATTA AGGTCAAGTG 9060

TTAAAGGTAA ATATGGTAAG GTATGTTGAC TTTGCCCAGT GAGTTGTTGG CATTGGTGAA 9120

TCTGGAAGGG AAACAAAAAA TTTATGTTAT TAGTGCAGAT GGATCCTATT TTACAAAATT 9180

TACGTTCATC ATTGGAAACT CCAGACTTAT CAAGGAAGTC CCCGGGCACG TCAAATAAAA 9240

ATGAAAAAGA TGAATTTGAA CCAGCAGTTG GGATTTCTAG CAAACGATCT GATGAATTTA 9300

ATATGAGACG ATGTCAAAGA GATGATAATT TAGGTAAAAG TCAGATACCA GTAGTAGATA 9360

TAGTACATGA TAAAAATCCT AAAATGGCAG AAGAAGGAGA CTTAATGTAT AAATGTTCTG 9420

CTTGCATAAA ACTTGATGAT TGTAAACAAT TAAAAAGTGA TATGTTCAGG CGGGATTTTG 9480

CTGGAACTAG TCCAGCTCAA AGACACATAG AAGCGGCAGA GCTAAAGAGA AATGGATCTT 9540
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Fig 1 (cont) 

ATACTCGTAG TTTAGAACAA TGGACACATG ATTCTTTTAT AAGTCATGTT AAACAATTAC 9600 
TTTCTAGACC ATTTATATCT CTAGGTATTA CATATTTGGA TGATTTTTTG CAGACTTATT 9660 
TAGATCATAC TGAATCGTCT TCTTTAAACT TTCAACTGTT TACTTTAATA AATCACTGTT 9720
CAGAAAATAC TTTAAAACSG ATTTTAAAAC ACATTTCTAA AAAAAATGAA AAAAATCAAT 9780 
ATGTAAATCA ATGGTTGATT GATCTCATTA CATGTATATA TCTAATTATA AGAGATGAAC 9840
AAAATGTTAC AGAACAAGTT AATGCCCTTT TAGTAACTAG TAATCACTTA GCTTTACATT 9900 
TTGCAAAGAA AGCTACAGGT GGATTCTATC CTACAGCAGA CAAGTTAGCG AAGACTCATA 9960 
TTTTTTTCAA GAGAATAATT TTAGGAATAC TTTCGCTAGC AGAAAGTATA GGTTGCTATA 10020 
CTGTGAATCC ATATTGCAAA AATCCTTTGA AAAAGTCAAA AGTAGAAGTA GAACCAAGTG 10080
ACGAAATGTA TATGTTCAGC TTAAAAGGTG CACTTGAACA TCCTGATTCC GACGAAGACG 10140 
AAGACAGTGG ACTTCAAAAT GAATAATTAT CATAAATGGA CTTCTAATGT TATAGATGCA 10200 
ATTCTATCAA ACAAAGCTCT TTTAGCTATA AAAATTTTAA AAGTCAACCG TTTGCAAACA 10260 
AATTGAATGC TTTAGAATCA GCAGTTGTGC CTCCAAGAAA AGATGATACT CCTGAAATGA 10320 
TAGCAAATCT TTTAAAAGAA TTAGTTGCTT TGGGAGCTAT TCGCAGTGAT GAAGTTGGCC 10380 
CATTATATTC TGACCTTCTT ATCAGAGTTC ACAAATATAA TAGCTTGAAT GTTCAATCAA 10440
ATTTGCAAAC TTTAACAGGA GACATTAAAT CACTTCAATC CGATATAATT AGAAGTTCCG 10500 
ATATTCCCAA TTTAAGTAAT CAAGTTGTTT TAAATACATT TTTAAATTCT TTGCCCTCAA 10560 
CTGTTACATT TGGACAACAT AATTATGAAG CTTTTAAACA AACTCTAAGA TTATTTGTTA 10620 
ATGAGACACC TAATATTACA GTTTTTAGAT CAGGAAATGA TACTTTAATT CAGGTTAACA 10680 
TAACAGGAAT TCATACAATT AATTTGAATG ATGCATTTAA AAATTTAAAA AATTTTTGGG 10740
GAATAGTATT AACAGGTGAA TTTATTCCAG GTGATATTAC AAGCAGACTA ACAGCTAATA 10800 
CAAGAGTACT GCTTTATTTT CTTGCTCCTT TTACAAATGA TAATACATTC ACACCTGATA 10860 
CTTTTCTAGC TTTACTCATG AAATTATATA GATTGACAGT TTCTTCTGCT TTAGATTTTG 10920 
AAGAAGAAAC TGAAGCTGAA GTAGAAAATG TAGCTCAACA AATAGGATCC ACTAGTGCAG 10980 
ATTTTACAAA GACTTTAGGA TATCTATTAA AAAACAAAGA AGAATCATTT TCGCCTCCCA 11040 
AATCATTATC TCCTAGACAA CTGGGTATTT TAAGGTTCAT ACAGAAAAGT CTGGTAGATA 11100
AAATTGATAG AAATAATGAA GATCCATGGG ATGCTTTAGA AACTTTATCT TATTCATTTT 11160 
CTCCGTCATT TTATGAGGCC AATGGGCCTT TTATTAGACG GTTAATAACT TATATGGAAT 11220 
TTGCCTTACG TAATTCTCCT ACTTACTTCA GAGAAATTTA CTCCAACAAA TATTGGATAC 11280
CAC"-AATTC ATTTTGGACT CAAAATTATG CAGACTTTTT TTCGGAAAAG AAAGAAAAAC 11340 
AAAATTTCGA AACATTTGAA CCGCGGGAAC TTCCTTTACA AATCTCTGAG GAAGAAGCTG 11400 
TCCCGCATAC AGAAGATTTT CAGTCAGCCA TCTCGCCCTC TATGGGCCAA ACTTCACTCC 11460 
CTGCTCCTTC TGTGTCAGAA TACAGTAGCG TGCCTCGGTC AGCTTTTTAC CCTCTCAGAG 11520 
AACGTATCCA AGAGAGCATT TCAAAGGCAG TCATCCCTCC TTTGACAGGC TATGTCGGAA 11580 
AACAAATAGG TGAAACTATT TTCCCTGGTA GTGGAGATCT TGTAGCACCC GCTGCGTCTT 11640 
TAGTTGCAGC ACAATTGGTT GATTCAAGGT TTAATAACAG AAGACAAAGA TTGAAAGACG 11700
CAGCCAGAAA GCGTCACCGC TATGTTAGAG AGATGCATAA TATTTCTGAT AAAGAGTCAA 11760 
ATGCTTCTAA TGATACGGTA ATATCACCTT TGATTGGACA TGGTTCGCGC ACTGAAAATC 11820 
GTTTTGAATA TTTGAGACCT AAAGGTGGAA ATTATTTATA CTAATAAAAA TCATAACAGA 11880 
CCTGACGGGC GGTCATCCTT TTTTATTAGA TGCAGAAATT TGTACCTCCA CCACGAATCC 11940
TTGCTCCAAC AGAGGGTAGA AACAGTATTA CTTATACGCC TCTGGCACCA CTGCAAGATA 12000 
CAACAAAAGT ATTCTTTATT GACAATAAGT CTTCGGACAT TGAAAGTTTA AAC.^ACTA 12060
ATAATCACAG TAACTTTTTT ACAAATATTA TTCAAAATGC TGATTTGGCA GC^AxGAAG 12120
CAGCAACGCA AGATATTAAA CTGGATGAAA GATCTAGAT3 GGGCGGTGAA CTGAAAACTT 12180
TTATAAAAAC AAATTGCCCC AAIGTTTCAG AATTTTTTAA CAGTAATAGC TTTCTAGCCA 12240
GATTAATGGT AGATAAAACT GATCCAGAAC ATCCTAAATA CGAATGGGTA CAAAT.ACAA 12300
TTCCTGAAGG CAATTACACT GGAA3CGAAC TTATAGAT3A ACTTAACAAT GGTATTTTAA 12360
ACAATTACTT AGAAGTGGGA CGCCAAAAA3 GAGTAGAAAT TGAAGACA.A ^TAAAA 12420
TTGATACAAG AGATTTTTCA CTTGGATATG ATCCTGAAAC WGACTAATT A-T.CAGGAA 12480
AATATACATA TAAAGCTTTT CATCCAGATA TTATCTTGCT A-T.AAwT ^G.AGA . 12540
TTACATATTC TAGAATTAAT AATATGTIAG GTATAAGAAA GAGATTTCCA TA.ACTAAA- 12600
GATTTCAAAT TTTATACAGT GATTTGACGA AGGGAAATAT CT-^-n.iA -^AA.xTAx. 12660
ATAACTATCC TCATTCTATC GAACCTGTAA TGCAA3AC3A AAATGGAG. . A^A.AAw 12720
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Fig 1 (cont) 

TAGAAAAAAT AAGTGACAAT CCCCCCAGAT GGCAAACAAA GTACAGATCT TGGACTTTAA 12780

GTTATAAAAA TAATGGAGGA GCTAAAGCCC TAACTGTACT AACTGTTCCG GACATAACAG 12840

GAGGATTAGG TCAAATTTAT TGGTCAATGC CAGATACTTT TAAAGCACCT ATTACTTTTA 12900

CTAACAATAC TACAAAGCCA GAAACACTTC CAATTGTTGG ATTAGATATG TTTCCTTTAA 12960

AAGCAGGGTT AGTTCATAAT ATAAATGCGG TTTATTCTCA ACTTTTGGAA CAAATTACAA 13020

ATACAACTCA AGTATTCAAT AGATTTCCTA AAAATGCTAT ACTAATGCAA GCACCTTACA 13080

GCACCGTAAC ATGGATAAGT GAAAATGTCC CCTTTGTTGC AGATCACGGG ATTCAGCCAT 13140

TAAAAAACAG CCTTACAGGT GTACAAAGAG TTACTATAAC AGACGACAGA AGGAGATCTT 13200

GTCCATACAT ACAGAAATCT TTGGCGACTG TTGTCCCTAA AGTACTTTCA AGTGCTACAC 13260

TTCAGTAACA ATCTGGCTGA TATCTCTGGG CCTTATCCTC CTGGAACCGT TATGTCTATT 13320

TTAGTTAGTC CCTCTGATAA TACCGGGTGG GGTATTGGAA CATCAAGTAT GAGGGCTACT 13380

GGCTTGAAAT TTTCTAAAAA ACAACCTGTT AGAGTGCGAC CTTATTACAG AGCTCAGTGG 13440

GGACAGCTTA ATGCTCGTAC TTCACTTGAG AAACTAAAAA CCAAATTGAA ATATTATGAA 13500

AAATTGTACA GGGACAGACT AAAAAGAAAA ACAGTTGTTC CAAAGAAAAA GAGGTCACCT 13560

ACATCTCCTG CGGATCGACT TAAAAAATAT CTTAAAGCTG TCAGTCAAAT CAAAGCTTTC 13620

AATAGAGCTA GAAGAGCAGC CCAATAAATA TTATTTTTCA CTTGCAGATG AAGGTAGTTC 13680

ACGTGCTTAA ATCTCCTCAT CGTCGAAGAC ATACACGTCG TTACAAAAAA CTAAAAAAAA 13740

TCAATCTATC TCCATACATT TTACCTAAAG AATTGCAAGG CGGTTTTTTA CCAGCTCTCA 13800

TTCCTATCAT AGCAGCCGCA ATTAGCGCAG CCCCTGCTAT AGCTGGAACT GTAATAGCTG 13860

CTAAAAATGC TAATCGTTCT TAAAATTTAG AAAACTTTTT TTTTAACAGA TCACATGGCT 13920

TTTTCAAGAT TAGCTCCCCA TTGCGGCTTA ACACCTGTTT ATGGCCACAC CGTTGGAATC 13980

TGTGATATGA GAGGAGGTTT CAGCTGGTCT AGTTTGGGAA ATTCTTTTAC TTCTGGTTTA 14040

AGAAACATAG GTTCATTTAT ATCAAATACT GCTCAAAAAA TAGGTCAATC ACAAGGATTT 14100

CAGCAAGGCA AACAAGGTCT ACTGCAATCA AATGTTTTAG AAAATGCAGG ACAATTAGCA 14160

GGTCAAACTT TAAATACTTT GGTAGATATT GGAAGATTAA AGGTAGAGAA AGATCTAGAA 14220

AAATTGAAAC AAAAAGTTAT AGGGAACGAC CAACAAATTA CTCAAGAACA ATTAGCTCAA 14280

CTAATAGCCA GCTTAAAACC AAAAGATGAA ATGTTTGTAA AGCAATCAGA AAAAATTGTT 14340

GAACCTATGA GACGAGAAAT TAAATCTAGC CAAATGCCTG TAGAAATGTC TTTTTATGAT 14400

TCTGTAAGTG ATGAACCAAT CATAAAAACC AAAGAAGTTA GCCCTCCTTC ATTTTCATCT 14460

GAATCTTCAC ATTCATATTC TCACCCAAGA AAAAGAAAAC GCGTATCCGG TTGGGGTGCA 14520

TTTTTGGATA ACATGAGTGG AGATGGAGTA AATTTTAATA CAAGAAGATA TTGTTATTAA 14580

AAACACTTTT TATTTACAGA TGGAGCCACA GCGTGAATTT TTTCACATTG CGGGTAGAAA 14640

TGCAAGGGAA TACTTGTCTG AAAATCTGGT ACAATTCATC TCTGCCACTC AAAGTTTTTT 14700

TAATCTTGGA GAAAAATTTA GAGATCCTTT TGTAGCTCCA TCGACGGGTG TAACTACTGA 14760

CCGTTCTCAG AAACTTCAAC TTCGTATAGT TCCGATTCAA ACTGAGGACA ATGAAAACTT 14820

TTACAAAACT AGATTTACTT TAAATGTAGG AGATAACAGA GTTGCAGATC TTGGAAGTGC 14880

ATATTTTGAC ATTGAAGGAG TTATTGATAG AGGACCTACT TTTAAACCTT ATGGAGGGAC 14940

AGCTTATAAT CCATTAGCCC CAAAATCAGC TTTTCCCAAT GCAGCTTTTA TGGATACTGA 15000

TGAAGCTACA ACAATTTATA TTGCTCAACT CCCTAATGCT TATAATGCTC AAAACAAAGG 15060

TGTAGAAGAA GCAATTCGAG TAGAAGCAAA CACTACTACT CCTAATCCTC AATCAGGAGA 15120

ATATGCTACT TATGACTCTG CCAAATTTAA TCCAGAAACT ACTGGTGCTT CTGGAAGGCT 15180

TTTAGGAATT AATAGCTTAG GAGATCTTTT TCCGGCTTAT GGATCTTATT GTAGACCTCA 15240

ATCAGCAGAT GGTAACATTT CAACTGCACC CATAACTAAA GTCTATCTAA ACACTACTGC 15300

TACAGATGAC AGGGTCAGTG GAGTTACTGC AGTTGACACC GCAACCAGAT TGCATCCAGA 15360

TGCTCATTAT ATTGAATATA CTGATGAAGC CAAAGCTAGA GCTATAGGAA ATCGCCCAAA 15420

TTATATTGGT TTCCGAGACA ATTTTATTGG ACTCATGTTC TACAATAATG GTTCTAATGC 15480

AGGAACATTT TCCAGCCAAA CACAACAACT TAATGTTGTT TTAGACTTGA ATGACAGAAA 15540

CAGTGAACTA AGCTATCAAT ATCTAATAGC AGATCTGACA GATAGGTATA GATATTTTGC 15600

ACTTTGGAAC CAAGCAGTTG ATAGTTACGA CCAGTATGTC ACAATTTTGC ATAATGAAGG 15660

ATATGAAGAA GCCCCTCCGG CCTTATCATT TCCTTCTCAA GGTATCCAAA ATTATTTCAT 15720

GCCTACTGCG GCAGGTAATG CGATGACAGT AGACACGGGT AGAAATACTG CAGCAAAAAC 15780

AGATAACACC AAGGCTTTTA TAGGATATGG CAACATGCCA TCTTTGGAAA TGAATCTGAC 15840

AGCAAATCTA CAACGTACAT TTTTGTGGTC TAATGTAGCA ATGTATCTGC CAGATAGGCT 15900
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Fig 1 (cont) 



GAAAACAACA CCACCCAACA TAAATCTACC TGATGACACC AACTCTTACG GATATATAAA 15960
TGGAAGGGTC CCTCTAGCAA ACATAATAGA TACATGGACT AACATTGGGG CTAGGTGGTC 16020
ATTAGATGTT ATGGATACTG TAAATCCATT TAATCACCAC AGAAATTCAG GACTAAAGTA 16080
TAGGTCACAA CTGTTAGGAA ATGGAAGATA TTGCAGATTT CACATTCAAG TACCTCAAAA 16140
ATTTTTTCCT ATAAAAAATC TTTTGTTGCT GCCAGGAACA TATAATTATG AATGGTACTT 16200
TAGAAAGGAT CCCAACATGG TTTTTCAGTC TACTTTAGGT AACGACCTTA GAGCAGATGG 16260
CGCAACTATT ACATACACCA ACATAAATTT ATATGTTTCA TTTTTCCCTA TGAATTATGA 16320
AACAGTAAGT GAACTTGAAT TGATGTTGCG TAATGCTACT AATGATCAAA ACTTTGCAGA 16380
TTATTTGGGT GCGGTAACTA ATCTTTATCA AATCCCAGCT AATACAAATA CTGTAGTAGT 16440
GAACGTACCA GATAGATCTT GGGGTGCTTT CAGAGGATGG AGTTTCAATA GAATTAAAGC 16500
TTCAGAAACA CCTATGATAG GAGCAACAAA AGATGCAAAT TTTACTTATT CAGGATCTAT 16560
ACCGCTACTA GATGGTACTT TCTATTTAAC ACACACTTTT CAACGAGTTT CTATTCAGTG 16620
GGATTCTAGC GTTCCATGGC CAGGAGATGA TAGGCTTTTG ATTCCAAATT GGTTTGAAAT 16680
TAAGAGAGAT CCTAATATGG ACGCAGAAGG TTATACTATG AGTCAAAGTA CTATCACAAA 16740
AGATTTTTAT TTGGTACAAA TGGCTGCTAA TTATAATCAA GCTTATCAAG GTTATAAATT 16800
GCCAGTACAT TCTAAATATT ATGGATTTTT AGAAAATTTT CAACCTATGA GTCGCCAAGT 16860
ACCAATTTAT GGTAATGGCA CTTATGATTT ATATACTGCT TATATTACAA ACCAAAGAAC 16920
CATGCAAATT TGGAATAATA GTGGTTTAGA ATCTAAAACT TCAAATCCTC CTATGTTATC 16980
CAACACTGGT CATCTTTATG TAGCTAACTG GCCATACCCT TTGATTGGAC CAAATGCTAT 17040
TGAAAACCAA CAAACTGAAA GGAAATTTTT GTGTGATAAG TATATGTGGC AGATACCATT 17100
TTCTAGTAAT TTTTTGAATA TGGGTAATTT AACAGATTTA GGGCAAAGTG TTTTGTACAC 17160
TAATTCTAGT CATTCACTTA ATATGGTTTT TACTGTGGAT AGTATGCCTG AAACAACTTA 17220
TCTAATGCTT TTATTTGGTG TTTTCGACCA AGTTGTTATT AATCAACCAA CAAGAAGTGG 17280
AATAAGTGTA GCTTATTTGC GCCTTCCTTT TTCAGCTGGT AGTGCAGCAA CATGAGCGGC 17340
ACATCCGAAA GTGAGCTGAA AAATCTGATT TCATCATTAC ATTTAAATAA TGGATTTTTG 17400
GGCATTTTTG ATTGCAGATT TCCAGGTTTT CTGCAAAAAT CTAAAATTCA AACTGCTATT 17460
ATTAATACAG GTCCCAGAGA ACAAGGCGGA ATACACTGGA TAACATTAGC ATTAGAACCC 17520
ATTTCTTATA AGCTATTTAT ATTTGATCCA CTCGGATGGA AAGACACTCA ATTAATTAAA 17580
TTTTATAATT TTTCACTAAA TTCTCTTAXT AAAAGGTCGG CCTTAAATAA CTCAGACAGA 17640
TGTATTACAG TAGAAAGAAA TACTCAAAGT GTTCAATGTA CCTGTGCGGG ATCGTGCGGC 17700
TTGTTTTGTA TATTTTTCTT ATACTGTTTT CACTTTTATA AACAAAATGT ATTTAAAAGT 17760
TGGCTTTTTC AAAAATTAAA CGGTTCAACC CCTTCTCTGA TCCCATGTGA ACCACATCTA 17820
TTACATGAAA ACCAGACATT TCTTTATGAT TTTTTAAATG CAAAAAGTGT TTATTTTCGA 17880
AAAAATTATA GAACATTTAT TGAAAATACT AAGACTGGAT TAATAAAAAC ACATTAATTG 17940
TATTCTTGCT TTTTGACGTT TTCATTAGTC TTCATCTTCA TCTTCTTCTT CACTGCTAGA 18000
TTCCAAGATG GTTTTTTTTT TCTTTGATGG AGTAGGCTCT TCAATAGTTC CAAAAGGATT 18060
CATATCAGAA TCCTCTTCTA TGTTAGGCAA CATAGTATTT TTAACCTGGA ATGACTGATT 18120
CCACTTAAAT TGAGAAAACT GAATTGGAAT GTTATTTCCC ATACATTCAT TCCAAAATTT 18180
ACGCACAAGA GTTAAACACT GTAACATATC TGGCAAGCTA ATTTTCATCT CACAAAATTT 18240
TCCATTATTA CGTCTCAAGT TGTATTGATA GTTACAACAT TGAAACACAA AAACAGCAGG 18300
GAATGTAACT GCTGCGGCCT GAACTCTATT AACATCCTGA ACATCAATTC CTTCCACTCC 18360
AGATATAGAA AATGGAGTTA TTTTAGGGAG TTGTTTTCCT ATTGTTTGTT TGCCACCATA 18420
ATTACATTCA CACTGACCCA ATATAAAAAG CATATTTCCG ACTTTAGCTT TCGGAAACAC 18480
AGCTTTTGTA GTTTCAATGG CATTTTGCAT AGCCAGCAAG GCCTTCTTTT CATCTGAAAA 18540
GTTAAGACCA CAACTGCGAG GAGAACATTG CCCAAAACGC TGATGGGCAT CCTCAGCACA 18600
TAACACGTAA TGTTCCTGAA CTATTTTTAC TACTTGTTTA TTCATACGCC CATTACTAAG 18660
AACACCCCTC CCTTCCTTTA GGGCTTGCAC CCCTGCTTCC GATGTTGGAG GCATTTCAAT 18720
TTCATTCACC CTTTTAAACA TGAAGTCACC ATGAAAACAT CTAGGACGGT CCTCGTCCCA 18780
ATCATGATAC CACAAATAAC AACCAGAAGC ATTAAAGTTT GGAATCAAGT CAATTTGCTT 18840
ACAAATTGCA CTATATAGCA TTCTACCTCC TACAGTAGCC ATAGATTTAC TGCTACTATA 18900
AGTCAAATTT ATAATTTTCA TCTTTTTCAT GTACTGAGCA AATAATTTTT CACAATCTCC 18960
TTCTTCAGGA TGAAACTTCA TTTGACTGGT ATCAACTTTA ACACACTCTC CAAATTTAGC 19020
TAAAATTTCG AGCGCCGCTT GAACTTTATT CTGAAATTCT TCTGTAGTAG ATTTTCTCTT 19080
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CTTGATAGAT TTAGTAACTT TTTTAGAAGA CATTATGTTA GTTTTTTTCT CGTTGTAGGA 19140 
TGGCTGAAAA AAATATGGGA GAGTCAGAGA AGGGTTTGAA CGAAGAAGAA TTTAACTCTA 19200
TTCTATCAAA ACATCTGGAA AGACAAATTA AAATCTGTAA AGCGTTAACA TCAAAATTAT 19260 
CGAAGTGGAA TATTGGAACA TTGTTAGAAA ACTTGTTATT TTGTCCTGAT GAAAGACAAT 19320
CATCAGGTGA TCCCGACCCA AAACTAAACT TTTATCCGGC TTTTTTAATT CCGGAATGTC 19380
TTGCATTGCA CTATCCATTT TTTCTAACAA CTCCTATTCC GCTATCATGC AAAGCGAACA 19440 
AAATAGGAAC TAACACTTAC CGAAAATGGA TGAACAATCA AGTCCTGGAT TTACAAATAC 19500 
CTTCCTTGGA AAATTGCAAA TGGGATGATA GGTTGGGAAA TGTAGATTTA ATTGAAGAGC 19560 
TTAAAGAGAA CCAAAAACTT GTTTTAGTAA AACAAGACCA TGAAAGAAAT ATATGGTTTA 19620
AATCAAAATG CAAACAACTT CAAAGTTTCA GCTATCCCTC ACTCAGTCTG CCCCCAGTTT 19680
TACAACAAGT TTTAATTGAA TCTCTTATGG GGATTAGTCA GGATCCTAAT AACTTTGACA 19740 
AAAATTACGA ACCTGCAATA ACTCTAGAAA AACTACAACA TGTAAACTGT GATCAAGATT 19800 
TAAAACAAGT TCAACAAAAA GTATCTTCAG CCGCTACATA CGGAATACTT TTGAAATGCA 19860
TTCAGACTTT ATTCAGTGAC AAATTATTCA TTCAAAACTG CCAGGAATCA TTACATTACA 19920 
CCTTTAACCA TGGTTATGTA AAA! TACTIC AATTTTTGAC AAATGTCAGT TTAAGCGAAT 19980 
TTGTAACTTT CCATGGTTTA ACACACAGGA ACAGACTCAA TAATCCGCAG CAACATACAC 20040 
AATTGGCAAC CGAAGACAAA ATAGACTATA TCATAGATAC AGTGTATTTA TTTTTGGTAT 20100 
TTACGTGGCA GACAGCAATG GATATTTGGA ATCAAACATT AGATGATAAA ACAATAAATA 20160 
TAATTAAAGA GGAATTAAAC CAAAATTTTG AGAAAATTGT CAAAGCTGAA TCAGTTGATG 20220 
AAGTTTCTGA AATTTTAAAG TCTATTATTT TCCCTGAACT CATGCTGCGA GCTTTTTGTT 20280 
CTAATTTACC TGATXTTATA AATCAGAGTC AGATATCAAA TTTTAGAAAC TTTATCTGCA 20340
TTAAATCCGG CATACCGCAG TCAATTTGCC CCCTATTACC TTCAGATCTA ATTCCTTTAA 20400 
CTTTCCTAGA AAGTCATCCA ATACTCTGGA GTCATGTAAT GTTACTAAAT CTTGCTTCAT 20460 
TTCTAGTAAA CCAAGGCAAT TATTTGCATG AACCCGAAAA ACCTTTAAAT ATTTCATCAG 20520 
TTTACTGTAA TTGTAATTTA TGCTCTCCGC AAAGAATGCC ATGTTACAAT AGCAGTTTGA 20580
TGCAAGAAAT ACTAACCATT GATAAATTCG AGTTCACAAA CTCTGATAAA ACAAAACAGC 20640 
TAAAACTGAC CCTCCAAACT TTTGCTAATG CCTATCTTAA CAAATTTAAC TCAGCAGAAT 20700
TCTACCATGA CCAAGTTTTA TTCTACAAAA ACTGTAAAAG TAAATTTTCT AACCAATTAA 20760
CAGCTTGTGT AATAAAAGAC GAAAAATTAT TGGCTAAAAT AGCAGAAATT CAAATAACGC 20820 
GGGAAAAAGA ACTCTTAAAA AGAGGAAAAG GAATTTATTT GGATCCAGAA ACAGGAGAAA 20880
TCTTAAACAA TGGAGAAGCC ATATCATCCT CTGAAAACTT CCAAAGGCAA AGAACTAGCT 20940 
ATGCTCTACC ATCAAATGAA GGAGAGCGAG CTGGATGGGA AGCCGATGAG CGAAGAAGAC 21000 
GAAGGAGAAG TGAGTGAGGA TGAAACAGAG ACAACAATTC CAAAGAAAAT GAAGTTTACA 21060 
AGTAAGTAAG CTCTAAATTT TTTATATTAA AAACTGAATT TTTTTAGACA AAATTATTTT 21120 
AAATTAAATC TTTATAGCTA GCAGTTGATC TTTGTTCGTT TTTCAGAAAA CTCAAGTGTT 21180
CAGTCATATC AAGTTCACTT GCCTCTGAAA CACGAAATTG CGGAAATTCT AGAAAAAATT 21240 
AGACTAGAAT CTAAAAAATA TCCAGGAAAA GTTTATCAAA TAAGAAATAG AACTCCAGCA 21300
AGTATTACAA AACGATACCT GTATGAAAGA GATCTGAAGA AACTGTTCCA GTATCTAGAA 21360
GACGCAAAGA AGCTTTACGC TAAGTACCAA AGCTGAGGCT TTATAGTTTT AAATTTTCCC 21420 
GCCATGGCTC AACCAGTGAC GCCTTACGTC TGGAAATACC AACCAGAAAC AGGATATACT 21480
GCTGGAGCCC ATCAAAATTA TAACACTGTT ATCAACTGGT TGCATGCCAA TCCACAAATG 21540 
TTTGCCAGAA TTCAACATAT AAACACCGCA CGCAATGTTA TGGACAAATT CCGCTCTGAT 21600 
TTGACCCGAG ATGACATCGC GGTTAACATC AACAACTGGC CTGCAGAGGA TTTAATGCAA 21660
CCTCCTAATT TTCCTTACAT TCCTGCGACC TCTAAATCCG CTTCAACCAT AAATGACTGG 21720 
TTGGCTACCA CTCAAGGAAT TCAACTCAGT GGAACTAGTG AACTAAACGG GTGGGGATCT 21780
AACCGCCTGA CTTCCTATCC GGATATTCCA CCCATTTTAA AGTATGAAAG GCCTGGTCAA 21840
CAACTTCAAG GGCAAGGACT TTTTAAGCAA GAAAATATTC ATTTATTTTA CGAATCTCCG 21900 
CGCCTCCCTC GCTCTGGAGG ATTAACTCCC CAACAATTTG TAAAAGAATT ICCGCCTGTT 21960
GTTTATAATA ACCCCTTCTC AGAATCTATG AGTGTATTTC CGAAAGAATT TAGTCCTTTG 22020
TTTAACCCTT CAGAATCTTT GAAAAAAACA ICCAGTCAAA CTTTACAATA TAAATAAAAA 22080
ACTTCTATTG ATCTTTATAC TTACACTAAA GCATCGCGTT TATTTTCGTC GCCATAAAAA 22140 
TATATCAAAG ACCCGTAATT CTCTAACTTT AAATCATTTT TTGAACTAAT CTTAAICCAT 22200
TTAAATGTAG GAATTAATAT ATCAGAAACC AGTAACAAGC CAGAATTAAA ATATACTTGT 22260
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Fig 1 (cont) 



GTCATTTTTA CAGATGAAGC GAGCACGCTG GGACCCGGTT TATCCCTTTT CTGAAGAGAG 22320
ACTGGTTCCT CTGCCTCCTT TTATTGAAGC CGGAAAAGGG CTAAAAAGCG AAGGGTTGAT 22380
CTTATCTTTA AACTTTACTG ATCCTATCAC TATAAATCAA ACCGGTTTCT TAACTGTAAA 22440
ATTGGGAGAT GGAATATTCA TAAACGGAGA GGGTGGCCTA TCAAGCACTG CTCCAAAAGT 22500
CAAAGTTCCC CTGACTGTCT CAGATGAAAC ATTGCAACTG CTATTAAGTA ATTCTCTAAC 22560
AACTGAGTCA GACTCTTTAG CTTTAAAACA ACCGCAACTT CCCGTAAAAA TAAATGATGA 22620
GGGGAGTTTA GTATTGAACT TAAATACTCC TTTAAATCTA CAAAATGAGA GATTGAGTTT 22680
AAATGTTTCA AATCCACTAA AGATAGCGGC AGATTCTTTA ACTATAAACT TAAAGGAACC 22740
CCTAGGATTG CAAAATGAAA GTTTGGGCTT AAATCTAAGT GATCCTATGA ATATAACTCC 22800
AGAAGGAAAT TTAGGTATTA AATTGAAAAA TCCTATGAAA GTTGAAGAAA GTTCTTTAGC 22860
CTTAAACTAT AAGAATCCTC TCGCCATTAG TAATGATGCG TTAAGTATAA ACATTGCGAA 22920
TCCATTAACT GTTAATACAA GCGGATCTCT AGGAATATCT TATTCTACTC CCTTACGAAT 22980
TTCAAATAAT GCTTTATCAT TATTTATAGG AAAACCTTTA GGATTAGGAA CTGACGGCTC 23040
TTTAACTGTA AATTTAACTA GGCCTCTGGT ATGTCGTCAG AACACTTTGG CCATAAACTA 23100
CTCAGCCCCA CTAGTGTCAT TGCAAGACAA TCTTACTTTA AGTTATGCTC AACCATTAAC 23160
TGTAAGCGAT AATTCTTTAA GATTGTCTCT AAATTCTCCA CTAAACACAA ATAGTGATGG 23220
AAAACTTAGT GTAAACTATT CTAATCCTTT AGTTGTGACT GACTCTAATC TTACCGTCAG 23280
TGTTAAAAAA CCTGTAATGA TTAACAACAC AGGTAATGTT GACTTAAGCT TTACAGCTCC 23340
CATAAAATTA AATGATGCAG AACAGTTGAC TTTAGAAACC ACTGAGCCCT TGGAAGTGGC 23400
CGATAACGCT CTAAAACTGA AACTTGGAAA AGGCTTAACT GTTAGTAATA ATGCTTTAAC 23460
CTTAAACCTT GGAAACGGTT TGACTTTCCA ACAAGGTCTT TTACAAATTA AAACTAATAG 23520
CTCTCTAGGG TTTAATGCTT CTGGGGAATT ATCAACAGCT ACAAAGCAGG GAACCATAAC 23580
CGTTAACTTT CTAAGCACAA CTCCTATAGC TTTTGGGTGG CAAATAATAC CTACTACTGT 23640
AGCTTTCATT TATATTTTAT CAGGAACACA ATTTACTCCT CAATCCCCAG TAACTTCTTT 23700
AGGTTTTCAA CCCCCACAAG ACTTTTTGGA TTTCTTCGTT TTAAGTCCGT TTGTTACATC 23760
TGTAACTCAA ATTGTGGGAA ATGATGTTAA GGTTATTGGC CTAACTATTT CTAAAAACCA 23820
ATCTACCATA ACTATGAAAT TTACTTCTCC CTTAGCTGAA AATGTACCAG TTAGTATGTT 23880
TACAGCACAT CAATTCAGAC AATGAATATT TTAAAAATTC TTTATTAAAG AGTAATCTTT 23940
TTACATACCG TTCTTGACAT AATGTGCCTC TATAATTAAC AAATCTAAGC AAGCAAGGTT 24000
GATCATTGGA ATCTATAGAA GCATAACTCT TCCAATAAGC ATAATCATAT GGCGGTAAAT 24060
GAAAACCCCT TAAATCTACC ATATTCATCT TTAAGTGTAC AGTATCTAAC AGGTTTTTAC 24120
AATCTTGCAC TTCTGGACTT TTAAAAACAA ACAGTACTTT CATAGGACAA CAATTGTAAC 24180
GGTTATAATC TGTTACAATT TTACTTATTT CTTCTTCCAA TGGCAAAGCA TTCCAAAGTC 24240
TTGTTATAAG TACTGTAAAA TCATCAAATG AATAACATAA CACATTTGTA CAACAATTGG 24300
TCCAAGGTAA AAAAACAGGC ACACGAACAT GAACTTTTTT TAAAATTAAC ATCAGTGTCT 24360
GTTTTAAACT TTGACATTGC AAAGAATTTG GCTGCAAGCA ATGACAATGA AATTGATTTT 24420
GCTGACAAGG TAAGTCACAC AAATACAACT TTAACAGCCT AAATATAACA ACATTAATGT 24480
AACTTTCCAA GACTTTAAAA CTAACAAACG GTATATCACA ATAAAAAAGA TGATGAATCC 24540
CTTCGCAACA CATAATGGAG TTCATGCTAC ATCCAAAGAT GGTTCCGACA AACCTCTGTA 24600
AATTAAAGAA CAACAATACA ACATACGAAG AAAATTAAAA CGTTTTTCAA AACGAGATAT 24660
ACATTGCTGC AAAGTATCTG AACATTTACA TTTTATACTT ATAAGCTCAC AAGTTTCAGA 24720
AAATGTAATT CGTTTAACAG TTTGATATGA ATACCATTTT GAAGAAAAAT AGAAATAGTT 24780
TTGTGCATTT GTAAGCTCCC AGAAACATTA ACGGACAGGC AAATCCAAGT ATTACAACAA 24840
ACAGGAACAG TCTTAACGTT TCGTTCAGAA AACAAAGTAA CAGGCATATG ATTAAAGCAA 24900
GACAATAAAA CACTTTTGGC AGCTAAACAT TGCAAAGATC CAGGTGAATT ACAATGACAA 24960
TGATAATAAA ACTTATAAGC CATATCGGCC CTCTTGCAAA ACGAATCAGC TTTTTGGCTT 25020
ATAGGAAAAT AACAAAAAAA CTGATTATAT ATGAATGGAG TTAATATCTT CTTCAAATTA 25080
TACACACGAA TAGCAGAACC AAGACGACCA CGCCCAACAC AGGTAAATAT TTCAAGTCCA 25140
TGACTAGGAA CAGATGGTTT CTCAGAAGCA ACAACTTTGA TTTGCTTATC CATCACTGCC 25200
AATCAGGCTT AATAGGAAAA GAAGAAAAAT AATTTTCCCA ATAATAACGA AAGAAATTCC 25260
ACGTTTCATC CTGTACATTA CTAGTCACAA ATACAACCTC CGCTATCAAA GATTCCCTAT 25320
CATTTAAAAC TCCCACCAAA TTGTCCCAGT CTACCTCAAA AAAGCCAGTT CCCATATTTT 25380
CAAAATTTGC CCATTTTAAA TAATGCAAAG CATCAAATTC AGGAAACAAA TCTTTCTGAG 25440
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Fig 1 (cont) 

CTAAAACATA TACAGTTTTA TCGCCATTAA ATCTAAAAGC CATCCTAAAT GGACCTCTAG 25500 
CCCAGTAGTT TAAGTACCGG GAAGAGACTA TACAATATAC TTGATATTGA TGTCTGTTAA 25560 
GTGGTGATAA AAAAGAAAGT AATTCAGAAT TAGGATAAAG CATTCTCCCA TGTTGATTCA 25620 
TCTACAAAAA ACAAAAAAAT TATAAGGTTC ATAGAAAACC TACTATTTAA CAAATCTATA 25680 
AAAATGCATT AAAAAGTTAC CTTGAATATA AATTCAGATC ACCTAAAAAA CGAAAAAAAA 25740 
TAACATTTAT GTTAGTAAAT GATAGTCTTT AAAAATTAGA AAAGAATCAA GTCGCTTTTA 25800 
TACTTACAAA CTCCAAATAA ATTCTGTAAC CAAGAGAAAA ATTGTAACCT AAAAGGTAAA 25860 
GAAGAACATT ATAAGATTAA AACCACTCTA AAATCTGAAA AGCATTATGA AAAATTCTGA 25920 
TAGCTGCAAC TTACTAGTCT TCTCCAAATG TTGCAGGCAT TTCAAAAAAT CAAGAGGAAA 25980 
ACCGGAGTTT ATAAAGTAGT AGTCTGATTA TATCTGAAAA AGTTTAACTT CCTTTTCAAC 26040 
CCAACCCAGT CCAATAAAAT TCCAACCTTA ACTTCTTTCC TGCTAAAACT CCATAAAAGT 26100 
CCAATTACCA CTTGACTTTT ATTTAACCTC AATTATGTTA CATGTTATTC TACCCATAAA 26160 
AACTTGATGA CCAAGAACTG ACCTTTCCCA TGTTTTTCTG AAATAACAAA AATGTTGATT 26220 
TAAAGATTTT TAACTACCCA AAAAACCCGC TCTCATGATT TTTTCTTATA TAAACAGGAT 26280 
ACAAAAGAAC TGGCAAAGAT ATTCCATCAT ACTTCTCCAA CTGTCAAAAC ATACCACTTA 26340 
ACCTCTCCCA TGTTTTTTCC CTTTTGCACA AACAGGATAT AAAAAATATT TTTGCCACAA 26400 
TGTTTTTCCT TTTACTCAAC TGCCAGAATA AAAATGAACA GCTTAACCTT TTTCCCTCTT 26460 
AACCCATTGC GTTCCTCTAA GAAAAAAATT ATCCCGCCCA ATATGCTAAA GGCTTCTCCC 26520 
GCCAAAACAG CTCAACTTAA AATCTCTCAT GAATAAAACC CAGAGAAAAT TTCCAGTAAT 26580 
AAAAATTAAT AACCGTGAAG TACTAGATCT AATAATGATA TTTTGAACTC ATAAAAATCC 26640 
ACCATCCATG TAATGTTACA AACACTTTTT TATTGAGTTT TTTCTTACAA CTGCATTACA 26700 
TACAGGCCAA GCATCAAACT TTCTTCTGTA TTTCTTCCTA GACCACAAAA TTACAGACTT 26760 
ATATTTCTGC CACAAATCTC TATGATCTTT ACAGTAACAC TTACATTTAA ATGGGGAATA 26820 
CAGCAGCAAA TAAGGATGAG TTAAACATGC GATACAATGA CCAGAAGGAA GATAATACAA 26880 
TACATCACAC CAAAATGAAG GTACAGACAA CATCGCATGA AATCTTAAAT GTGATTTTAC 26940 
AATAAATTTC TGCAGCAGCT TACAATCTAT ATTAGCAAAC CGTTTTATAT ACAAACATAA 27000 
AAACTTGGAA CTTTTCACCA ACTCAATCAT GTTATTATAA CACATTACAA ATTTTGCTAT 27060 
ATCTTTATTT GTCAAATAAC AAAATATCTC AATCCACAGC TCATCTGGCA GCAAACTTCG 27120 
CAAATCCATG ACCTGTAAAA GATACAACAG AAAACAGAAA ATTAATGCCA TTCAATAACA 27180 
TAAAAAATAC AGTCAAATCA CATACTTTTT CTCACTTACA AAACTTTGTG AGCAGGCCTC 27240 
CAAAACAAAC TTCAGAAAAT GGATGCATAC AAGAACATTC TCCTCTCAAA AATTGCTTTA 27300 
ACTGAATGCG GCATTTTGCA CCTCCAGAAA AATGCAGTCC ATTGAGAGGC TCTTCTCTTA 27360 
AAACACAGAA ATGCTTCTGC AAAATCTGTA AAGAAACTAA CAACTTCCAA ATTCCAATCA 27420 
TCATGCATTG CAAAGAAGGA CATTCAACAG CAAAAGGATC GTGATGAGCC AATAAAGCTT 27480 
TACTGTATGA CTCATTTTCA TGAATTACAG TCTGTAACTT ACTATAATGC ATTTTAAGCT 27540 
CTGCTTCACA AATTAATAAT GCTAATTTCT TTAAGCAGCT CAAAGAAAAC TCATCAGGAC 27600 
AACGGCATTT AAGAAAGCAA CAAAATGATT TCTTAAAATA CATTTTTCCA GCATGATGAA 27660 
CAATAAAAAA TTTCAACGTT AAACAATGCA AAAATGCATT TTTATGCACA GTGAAAGTAA 27720 
TTTTTTCAGC TGAAGCTAAA TCACAGCCTA TTTTATTACA TGATTTTGTA TGCTCCAAAA 27780 
GAGCTTGTTT TAATTGCTTC AAATCCATCT TCTTACAATT TTTTCTTTTT ATAAACACCA 27840 
GAACCGCATT CAGGCCAATT CCAGTTATTG TTTAAATTTG CTACAGAAAC TGCAGACCAC 27900 
AAAACCACAT CCTCTAAATC AACCCACAAA GATCTATGAT CCACACAAAA ACACAAAGAA 27960 
TGATACGGAG AATACAACAA TAAATGGGGA TTAACAAGGG ACGCAACACA ATGACCCGAA 28020 
GGTAATAAAG TTTTACAGCA CCAATTACAA GCAACAGGTA ATGGAGTATA TTTCCCAATG 28080 
CGACGAGAAA GCCGAATGTC ATTCAGAACA GCATTGCATT TTATCTTCTC AAACCTCTTA 28140 
AGGTGCAATT GTATAAAATA AGAATCCTTA ATGACAGTGA TGAATTGAGG AAAAGCAAAA 28200 
ACAAAACTAG CAATGTCTTT GCTTGTAAGT TTCAAAAATA TCTTCATCCA AATCTCAGTC 28260 
GGTAATTCAA CAAAAAATTC AGGGGCCTAC AAAATTAATC AGACTAATTT AATATCATCT 28320 
TGTAAACAGC GAAAAGAAAA AATAACACAC CCAAAAATAA AAAACTCTTA CCCCTGTTAT 28380 
CCATCGAGAT ACACAGAAAA ATTCAGAACA CTCAGTGTCA TGTTTCTTAA ATTGTTCCCA 28440 
AAGCTCAGAC ATTCTAAGCC AAAAATTTTT TGAGAACTGC AAAAACCCAG TTTTTATAAC 28500 
AAAGCCTTAA TGTTTTCTTA ACTGATTTAA CTGCCCTAAC AGGAACTCCA CATTCCGGCC 28560 
ACCGCCACCC AGGGGACAAA TCTTGCCAAG AACTACAAGT CCATAAAACA ACATCCTGCA 28620 

AATTATACCA AAGGTTTCTA TGGTCGACAC AATTACAACC TGACCTAAAA GGTGAATAAA 28680 
GCAGTAAATA AGGATGAGTT AAACAGGCCA CACAATGTCC AGAATGTAAA AAATGCTTTG 28740 
TTTGGCACCA ACCAGACCAC AGCTGAAGCA AAGGAAAATT GTAGCGAACA CATTCTTCTC 28800 
GTAATCTGTT TAACACAGAA CAACATTCAA TTCTGGCAAA CCTCTTTAAA AAATGTTTTC 28860 
TGAAATATTT CTTTAAAATG ACAGTTTGCA ACTCTGGAAA ACACAAAATA AAAGCCGCAA 28920 
TATCTCTACT GCTTAAATAT AAAAATATCA TTGTCCAAAT TTCTACTGGT AAAACTGAAA 28980 
GCATCTTCTT CCTATTAAAA AAAGAAAAGT GTTTTCAAAT TATATTAGAC TCTAACCAAA 29040 
AAAATTCAAA TACTTTTCCT TTATAATGTA CATTAAGAAT AAAAATATAC TCACCGTTTA 29100 
AAAGTAGAAC TTAACAGTAT AATATAAATA CAAGTGAGCT GAACAACGAC AGCCGATTTC 29160 
AGCCGGAGCA AAATTAAAAA GAATAAAAGG ATCAAACCAA CACGTAGGAC AGTCTACTCC 29220 
AAAACAGTAA CGGCAGTATG ACACAGAAGG AGAGGAACTA AGTCCAGGAA ACTTCGCCCG 29280 
GTGCGATAAA AAGTAACGCC GCCGGAAAGC AGTTGAATAC AAAAGAGGTA AAAATTCACG 29340 
AAAAACAGAA GCAAAAACTA CTAAATCTGC TATTGGCAAA TAAAGAAAAA TTTCAAACCA 29400 
TATTTCCAAA GGAAGAAAAG CAATCATACC GTAGAAGAAC CTGAAGGCGA CCGCAAACGT 29460 
GCTCCCGTAC CACAACGTCA CACGCCACAC CCACTGGGAA AACCCACACG CCCCGCCTCT 29520 
GTGCAACGTT ATATATATGA ATAG 29544 
Figure 1? NUCLEOTIDE SEQUENCE OF PLASMID pOAV100 

KpnI site (with 3' terminal sequence) 

CTATTCATATATATAACGTTGCACAGACGCGGGGCGTGTGGGTTTTTTATTGTTTATTCT - 
CATG GAATTTACAAAGAAGTAAGTTGTT GGAT CTTTATTCACAATT CTTTTAACAAT GAC - 
TTTT TTACT* 1 * ATTACATT TTTCATCTTTTTTACTT CACATGATATTTTACTTAAATTTTG *■ 
TACATACAAGCCAAAATT CGCATAAAATGTCTTACTTTAAAAAGTTAAATTTTTTTTTTA " 




TCCTGCTGATGCCGCTGuAfiflivw *™ 
TTGTTCATCTGCTGCTTTTATTAXATCTTCTGCCAATCTAGGTGATATTTGCTTTTGAAT^ 

CKTTGTTTCCAAAAGCTTGCATCATCGCATTW 

tcctAaaaaatagcccaacccatctaaagcagttaaaagtattctccctccaggaaccac 

agatataattaagcggagcaaccgagaggttaaattcgagggtcct^ 
taggatcaggccaagaagtgaaccaaaaagactxgtaagtagaagttgtct 

tg<sagaggactgtAaaaattgcaaaacggtatctaatgaccatttcttctttacttttac 
atctgtatcatgttctccatcagaaggtcttattgggaagtaccattggtcacgaggatc 
tctgaagacttctgtttcttgaaattctgttttcggtaagcgactagcagttatggtatt 
aggaatattgacggtaatgttattc^^tctacaatttctggaggaatccatcttgcata 
sgatgaaatgggttttgtgoottctttcaatatataattgcgaggagggtttttccaaaa 
tctctgaacataagtattttctgattttggcggttttttgctttttcgcgctctttttct 
tggctttggtctttgaaattttttcttcctttttctgtaggctcctcctgctaaagcxgt 



gttatttgtgacgtacatcctgttagctacacgattttcccggactgcaaatttttttgX 
caaatggaaaagaaattgctgaaaccxxctattaat^ 

GJWVrafVGATAGrGCMOATTTTTTCTTO 

GATCAAGTGTCTTGGATAXGTTTAAGAGATATAACTCTTCATTGT<3ATCGCATGTGGTTA 

gcggtttgtttttgtttgtgcaaatctaaatttgatgtacacaatattctagcgggagta 

CAT GTTAT 3 TAAT GAAAATGACGTC GG GGATT GAAT GGAT T GAGC CTTAT TTGACATTTT 
TCTGTGATTTTXTTGCCTTATTAGGAAATAAATTTGTGGCGCCAGTACGATGGAGATTGG 

AAT GACT C C T<3 CATTTACAGAAAG GAATT T GTACT GTGT T T T GCTT GACT T T AATTTAAG 
ATWTATCAGCAGATATTTAACCCAAITATGGATTAAGCCAAATTTATGGGCTTTCTCTGA 
TTTTTTAAAAAAAATGGCCTTTATTTATGCTAGCGACTTGGCGTTGTfAAATTCTTACAT 
CCCTGGTAATGTTTGTAACAAACTTGAXATCATCAAGAAAGATCTTCCTGAAGATTTTAC 
CGTGTCTATGTTTTGXGTCTTAGTGTGTTGGCTTGCTTCTTTCTGTAAAGGTTCTAAtTT 
AGC T GAAACT G GCCAGAAT T GX CACGC G GTAAGCAAATTT CX G GCACAACTAT CAAAAX T 
AATAAAACCCTAATT TTT AGTTX GT AAAAATAG AATT CAAATTTT TAAC GCCACAAT GAC 
TTCGGCGGAGTTTTCTGTTGAATTTCCTXATGTTTCTAAGCCAATTGTTCCATGGCCTGC 

TTC GGCAT CTT CTAATAAXTCAT C GAGT CAGAATAT T GACTTTCC T GTX CTTAAACCAGA 
TCAAGATCCAAXAGCCTTCTTTCAAACTAACAATACGGCTTACTTACAACCTGGAGCT 
TTATT ACTGGAAGT GTAXCGAACT GTCAAAGCCTATTCAGATTTACGGTCAAGGAGCTAC 
AGTACAACT X GX CG GAC CX GGACCT GT GTXT GTTTT CAACAGX GAAAGX GTTATT C CXGA 
AGATTTTTACGTCGTGTXTGAAAAXATCAACTTTATTGAAGATGAATXXCCTATTAGAAG 
TGGCCAGTTAAGTTTAQGACTTACAACTCACAGTGCTGTATGGTTXATCAATGTATGGAA 
AACTTCAATAGTCAAXTGTAACXXTAAAAATTTTAGGGGAGCGGCTCTTXGGTATTCAGA 
— „ «* ^ * * ^mwmnsm Am/^rzariaaaaTMAATCAtSCAGCAXTTAGTTT CAAATTGTCG 





AAAX GAX T C GGTAAT AG GAACAT T A 1 i A i Al>uiiVi'-ww^ * Aim * -ZlZl 
T G CAT GAX C CACT AT AT CX X T AAGT ACAG GGAT AAGT GCACX C GGAAAX C CAAAAGAATA 
GTTTTTAATAAAXCTATTXATCTGTGAAGAATCAAGCTGCGGACTAATAACATGACATTT 
TGATTGAATTTTTAAATCCTTAATATTTCCTCTATCATGACGCGGGTTCATATTATGTAA 
AAGTACTACAACAOTOTAACCATTACATTTGGCAAATCTATTAAAAATTTTTQACGGTAA 
AGCATGAAAGAAAGAACTTATAGAATGACAT SAT CCCAATT GATT CATACATTCATCTAT 
TATAATACAGATAGATCCTTCACTTGCAGCTCTGCAGAATATATTATCTGGATTATCAAT 
ATTTAGATTAGTATCOOAAATAGCATCTTTGAAAGCTAATTGTATAAATTTTG<a\TTTAA 
TGTTTTTGTIAGTGGATTASAGAATGCATCGTAGTTTCCTTCAACACACTGTGCTTTCCA 
CGCAATTTTTTCTTCTAATGGAACAGTACCTTTTTCTGGAGTTAT GAAAAARATTGT TTC 
TGGTATTGGATCAATTAGTTTTCCAGATATAATATTTCTTATAAATTGAGATTTTCCGCT 
ACCT GT GGGT CCATATACAGTAACAAT GAATGGTTGTAAT CCGCAGTTTAAACTGGGTAT 
ACAGCCATCTTTTAACAGATTGTGAGCCT CATTT ACAGTTTTT T GATAATTTACAGCAAT 
ATTGTGTAAATCAGXCATAAGTTGACCATGATACATACATTTATCAAAAACTTCTTGACT 

ttctggaaatggarttctgcaaatagaaggatctatctttacaacatcatttttccaatt 
taatgtgtcacttaaaaattttcccaaaaagl^tottctgtcaatggttcttgcggtctt 
ggatttgggtgtctcttgtcgtacggctaaagtaagtatcctt t ct tccactggatccct 
ttcctcatcgtttgatccttccaaggtctcagaattctggttagttgcxtctctaccacc 
gtgaatggtacatcggttccacttgcggtttgcagtgtcttttttaaacttttcctcgat 
gtctgaaactctttctgtggttgttctaataaattatagtcagtaaaacaatgttttaga 
atttcatagtttaaacaatttttagcatgaccttt ggctcttaattttccttct ccaata 
aatttacagtttttacaagttatgtcttttaaagcatataatttaggacctaaaatacat 
gtttctgaactgaatgcrtcagctccgcaacggttac/iaacagtttcgcattcaaccaac 
caagttagacat ggat gttttt catcaaagattaaattt gagttatat tttttaagt cta 
tgtaatccttttgataacatgagttggtggcccttttctgttaagaataacgagtctgta 
t caccata^tactttttatctccctt t ctatgtaaggtttacccatatctt cc ccatat 
aaaatttct gc ccact cact catgaaagctct ggt ccaag c cagcacaaaggat g ctat c 
tgagttggatatcggttgttcttgatgcattcttccttatcctcaatagttgttaaaatt 
aaatcat t ac aat cagcagat aaaaaagt t ataggct t aaaagt c ac gtgat ct tgattt 
cctataaaaagtggaaaattaaaattttcatttgtgtctttggaatctttgggcggcatt 
tcaggtaggtttgaaaaatactgattccactcaaatgaacgttttggtaatgatttacta 

ATCACZAGTTGXGTATGATGTAATTTCAGCTGATCCATTTTCTAATCTTTTTTTATCTTTC 
TCTTCAATATTTTCAGCAAACACTACTTTCTTTTTATCTATACGGGTAGCAAACGftACCA 
TATAAAGCATTT GATAACAATTTACTTATACTTCGCT GAATCTTGTT GTTACTTTTACTT 
GCTT XT T CT T TAGCCATAAT ATTTACTTTCACATATTTT T GACATAAC GGTT TC CAGT CA 
CTCCATACAGCATACATTTCAGAGCTTTTGATTATTTTGCATTTGCATCCTCTATTGTGT 
AAGGTGATTAAATCGATAGAGGTCAGTACTTCATTTATCAATGTTTCATTTGACCAGCAT 
AACTTTC CACT T TTTTTAGAACATAATGGAGGT AACACAT CAAGAT AAT CTAAT GATGGG 
GGTTCACAATCGGCTACCACAATCATAGGTTXGATTGAATTGTCAAAATAATCTATTTTT 
TCTTTTCTTTGTAGTAGTTCTTGAAAGTAATCTATTTGTGCATTGGCTTCAAAAGCATTT 
AAAGT TTTT C CATAT GGAAG T GGATGC GT T AAGGC AC TAGCAT ACATT CC GCAGAT AT CA 
TACACM?ATATTGCTTCTTCAAATATTCCTAAAAATGAAGGATAACAT CTTCCTCCT CTT 
AAACT CATTCTAACAAAATCATACATTTTTTCTGATGGAGCTTCCAAATTTCTTAGGAAT 
TCAGAGGGATGATCTTCTTCATTATAAAAGATTTGTTTAAACAATGCTTGAGTATTACTA 
C TAATT GTAGGACGTTGGAATAT ATT AAAAGAACACTCAAGCTTTAAAGATGTT GTACAG 
AAC TCTT GAT AAC CT TCTATAAGTTTTT CAACTAATT GAGCC GTAACTATAACAT CATCA 
ATACAATACTCCTTAGCTTCCTCTAATAAGTTGTATrrTTGGTTGTGTTTTGGTTTGTTT 
TGTAAATATTCTTCAAATGAATTCCAATATTTTTGAACTGGATAACCATTGTTTTCTTTT 
TCATATTCTCCCAACATAAAAAAATCATTGATTSCCCTGTAAGGACAATAACCTTTCCTA 
ACACTCAACTGATATGCAGTAGCAGCGTCT CTTAAAGAAGAGT GGGTTAACAAAPATGTA 
TCCCTAACCMAAATTTTATACCTTGCCATTTCATATCTTCAAAAtTAATAATTCCATTT 
OTCCATCTTTCATAAGTTGTATGTGAAGGTTTCTTAAAGCAAGGATTTGGAAGAGATAAT 
GTAATATCATTAAWAACAGTTTTCCAGCACGAGGCATAAAGCTTCTTGTCAGCTTAAAC 
ATTGAAAGTTCTTCACTGTCTATTCCTTCTAATACATGACTTGCAAGTATGATTTCATCA 
AAACCACAGATATTATGACCTACTACATATAATTCAATATATCTTGGTTCGCACTGTTTT 
AATTTTTTTTCTTTATTTAAGACCATGATGTCTTCATATGArAAATTTGATTCAAGACCA 
TGATTTTCACAAAACGTTGACWGTATTTTTTAGCTACTGAAATTTGTAGCTCTGTTCTG 
AATTTTTTAAAAGCTAT GC CAATTT CAT CTT CT TTTTTATTTAACATTACAAAACATTCT 
CTGTTTACCTCATAACCTATATCGGTAGCTATTTTAGAAGCAATTTTTATGAGTGATTTA 
CAT CCAATT AACTTAAAAACCAACAAGTAAGGAGTTAACT GTTTTCCATACAAAGAATGG 
TAAGTATAT GTTT CAATATCATAAACAATAAAAAGACGTTTT GCTTTTAT G GCTCCAACT 
GGATTAAATTTGATTTTTTCCCACCAGAGTTTTGTTTCATGGTGAATATTGTGATAATAG 
AAGTCCCGTCTTCT GGATGAGCAGTTGTGTATATTACTATAAATTGTT CCGCAGAATTCA 
CATTTATTCTGTTGTTTAACAGTTTTTATTAAATATATTTCT CCTTTTAAAAT CAATAAT 
TCTATTGGTAACAAATTTCCATTAAGAATTTCTTCAGTC^TCTTAAAA^TCITTTGTTG 
AACTTCCATATTTTTAAA3ATACGGGGGTGTTAGAATCACAAAGTTTTAAAACATCTAAA 



ACATTTTCTACTTTCTTGAAAGAATTTAATTT TAAACCCTGAATT GCAAAGTAATTATAA 
AAACTTTTTTCAAAAT i ctt gtagtatataatttttatatatgtatcct CATATATT CCA 

GTAATATAAGTAGTAGTTCTTTGCTTTATTATTGTCTTTGAAGCCAXCTGTTTAAAGCCG 
C T T C C C GT ACT C GC ? C AAAGCT T CT T AAAAC AACT T CAT T T GT ACTATAG CCAACAATT C 
CAGACAATTTTATTCTAAATGCTATTTCAACTGAATCTAAATCTGAAAAATCCGTGTTTA 
CTTGGTTGATTACTTCTTCTATGCTCCCACTGTCTTCTACGAAGTCTATATCTTGAAGTA 
ATTGGTCTCTTTCTTCTGGAGTTGAAAAAGAGTAAGATCTTTCATTAGCTTCTATAATTC 
CTAAAAAATCACGAGTTATTCTGCTATATAGTTGTCTGAATGCTTGTGTTTCrCTATTAA 
AC CAAACTCTAGTAAAT ATA'TCTTCTCCATTTTCATTTCTACCTCTTAAT ATAATTTGAA 
CAAATTGGATTCCAATATTTCTGGCAGCTAACCTATTTTGCACTAAATTTAAGTATAAGT 
AATATAGCGTGCTTGCCACATGCTCTAATATAAAGAAATACACTAACCATTTTT'SAATAA 
AATCATCAGTCAATCTATTTTCATTATAAAATCTAATAAGTAATTGAAAAAATTCACTTC 
CGTAATTAAAAAAATTACTCCTTCTTGCTTCAGGAGTTAATTCTTCTTCTAAATrTTGAA 
TTAAAT CTACTATTGAAGCTATCACTTCATCATTAAATTCTTCCCTACT CAGAT CGCTTG 
AGCTCGGCTCGCGATCTGAAAATCCTTCATCTTCTATTTCAGGAACAGTAAGAGGAGAAC 
TAGAAGTTTCTTCAACATTCCTTACCCTTTGGCGTCTATTAACAGGTAATCTATCAATAA 
ATCTTCTGATTACATCACCCCTTGAACGTCTCATTATTTCAGTAATAGCTCTATAATTTT 
CCCTAGGTCTTAATCTGAAXGGTAATCCTACTCTTGTCCCTGACCTTAAAGTTAATGCTC 
CACCAT GCATCC CACCT TTTCCTAAAGTTAATACAGTTGCTAAATCTTTTAAATTAATTC 

gattttcagctt ctggaattt c cagctgtgaaaatt catctataaaaagctcaatccaga 
attcagaaaaaggtaagtctaatatacattcactaxtatgcatgttagacaaaattaaaa 
atttacataaagcttttttaattttacaaattaactttataaggtaagtatccctttctt 
gcaaatttaaaac cataaaagctt gagaaaaaggttgataatgctgct gaaaagatctat 

TCTGATTTTGAGCTGAAATAGCGGAGCCAAAACCTTaCATGTCTGCAAGTtGCAGACTCC 

ctaatattctatccattaaaaccgcgttttgaatttgactaattgtttgtgaa^aatttt 
ctacattttgaattgctctcatatatgacccagtatttatggagtatgaacaatcagtta 
aaatttgcc aggt catgcgtct ctc aaaacttataggtgaaagatacaacttata.tgaaa 
tgttgct gt aagtcc gct gatcaaacagatactggtttaaaactcgcgccacataaaaat 
acccaattaataaatttggt ggaggttctcctt caaatggtggtt gtgaagtaagaggtc 
ctctt gggcgt aaatc gagtaattgagtcactggataattaaaaaatcgattagcccatt 
ttattcccctttcatgtatagtccttgacctggcaatacttcgattattaaggl'caagtg 
ttaaacgtaaatatcgtaaggtatgttgactttgccc^gtgagttgttgccattggtgaa 
tctgcaaggcaaacaaaaaatttatcttattact gcagatgcatc ctattttacaaaatt 
tacgttcatcattggaaactccagacttatcaagcaactccccgggcacgtcaaataaaa 
atgaaaaagatgaatttgaaccagcagttggcatttctagcaaaccatctgaxgaattta 
atatgagacgatctcaaagagatgataatttacctaaaagtcagataccagtac5tagata 
tactacatgataaaaatcctaaaatggcagaafiaacgagacttaatgtataaatcttctg 
' cttgcataaaacttgatgattcxaaacaattaaaaactgatatgttcaggccggattttg 

ctggaactagtccagctcaaagacacatagaagccgca^^ 

atactcgtagtttajgaacaatggacacatgattcttttaiaagtcatgttaaacaattac 

tttctac^catttatatctctaggtattacat^^ 

tagatcatactgaatcgtcttctttaaactttcaactgtttactttaataaatcactgtt 

cagaaaat actt t aaaac ggattt taaaac acatt t ct aaaaaaaat gafcaaaaat caat 
atgtaaatcaatggttgattgatctcattacatgtatatatctaattataagagatgaac 
aaaatgttacagaacaagttaatgcccttttagraacxagtaarcacttagctrtacatt 
ttgcaaagaaagctacaggtgcattctatcct^ 

tttttttc^vagagaataattttaggaatactttcgctagcagaaagtataggttgctata 
ctotoaatccatattgcaaaaatcctttgaaaaagta^aaagtagaagtagaaccaagtg 
acgaaatgtatatgttcagctt?aaaggtgcacttgaacatcctgattccgac'oaagacg 
aagacagtggactt caaaatgaataattatcataaatggacttctaatgttatagatgca 
attctatcaaacaaagctcttttagctataaaaattttaaaagtcaaccgtttgcaaaca 
aatt gaatgctttagaatcagca^ttgtgcctccaagaaaagatgatactcctgaaat ga 
tagcaaatcttttaaaagaattacttgctttgggagctattcgcagtgatgaagttggcc 

cat tat att ct gaccttcttatcagagtt c acaaat ataatagctt gaatgttcaat caa 
atttgcaaactttaacaggagacattaaatcacttcaatccgatataattagaagttccg 
atattccouattaagtaatcaagttgttttaaatacatttttaaattctttgccctcaa 
ctgttacatttggacaacataattatgaagcttttaaacaaactctaagattatttgtta 
atgagacacctaatattac^gtttttagatcaggaaatgatactttaattcaggttaaca 

taacaggaattcatac^ttaatttgaatgatgcatttaa 
gaatagtattaacaggtgaatttattc^ggtgatattacaagcagactaacagctaata 
caagagtactgctttattttcttgctccttttacaaatgataatacattcac?*cctgata 
cttttctagctttactcatgaaattatatagattgacagtttcttctgctttagattttg 

AAGAAGAAACTGAAGCTGAAGTAGftAAAXGTAGCTCAACAAATAGGATCC 
ACTAGTGCAGAJTTTACAAAGACTTTAGGAl'ATCTATTAAAAAACAMGAAGAATCATTT 
TCGCCTCCCAAATGATTATCTCCTAGAC^CTGGGTATTTTAAGGTTCATACAQAAAAGT 
CTGGTAGATAAAATTGATAGAAATAATGAAGATCCATGGGATGCTTTAGAAACTTTATCT 
TATTCATTTTCTCCGTCATTTTATGAGGCCAATGGGCCTTTTATTAGACGGTTAATAACT 
TAT AT G GAATTTGC CTTACGTAATTCTCCT ACTTACTT CAGAGAAATTTACTCCAACAAA 
TATTGGATACCACCCAATTCATTTTGGACTCAAAATTATGCAGACXTTTTTTCGGAAAA« 

aaagaaaaacaaaatttcgaaacatttgaaccgcgggaacttcctttacaaatctctgag 

gaagaagctgtcccgcatacagaagattttcagtcagccatctcgccctctatgggccaa 

acttcactccctgctccttctgtgtcagaatacagtagcgtgcctcggtcagctttttac 

cctctcagagaacgtatccaagagagcatttcaaaggcagtcatccctcctttgacaggc 

tatgtcggaaaacaaataggtgaaactattttccctggtagtggagatcttgtagcaccc 

gctgcotctttagttgcagcac^ttggttcattcaaggtttaataacagaa<^caaasa 

ttgaaagacgcagcc^gaaagcctcaccgctatgttagagagatgcat^ 

aaagagtcaaat gcttctaatgatagggtaatatcacctttgattggacatggttcgcgc 

actgaaaatcgttttgaatatttgagacctaaaggtggaaattatttatactaataaaaa 

t cataacagacct gacgggcggt catccttttttattagatgcagaaatttgtacct cca 

ccac gaat c ctt gct ccaacagagggtagaaacagtattactt at acgcctct g g cacca 

ctgcaagatacaagaaaagtattctttattgacaataagtcttcgwcattgaaagttta 

aactttactaataatcacagtaacttttttacaaatattattcaaaatgctgaxttggca 

g c g gat gaagcag caacgcaaga7at taaactggat gaaagat ctagat ggggc ggt gaa 

ctgaaaacttttataaaaacaaattgccccaatgtttcagaattttttaacagtaatagc 

ttt ctagccagattaatggtagataaaactgatc cagaacatcctaaatacciaatgggta 

CAAATTACAATTCCTGAAGGCAATTACACTGGAAGCGAACTTATAGATCAACTTAACAAT 
GGTATT T T AAACAAT T ACTTAGAAGT G GGACGC CJVAAAAGGAGTAGAAATT GAAGACATA 
GGAGTAAAATTTGATA<^GAGATTTTTCACTTGGATATGATCCTGAAACGGGACTAATT 
ACTCCAGGAAAATATACATATAAAGCTTTTCATCCAGATATTATCTTGCTACCTGAATGT 
G G C GT AGAT T T T ACAT ATT C T AGAATTAAT AAT ATGTTAG GTAT AAGAAAGAGAT T T C C A 
TATACTAAAGGATTTCAAATTTTATACAGTGATTTGACGAAGGGAAATATCTCTCCATTA 
CTGAATTTAAATAACTATCCTCATTCTATCGAACCTGTAATGCAASACGAAAATGGAGTT 
AGCTATAATGTAGAAAAAATAAGT GACAATC CCCCCAGATGGCAAACAAAGTACAGATCT 
T GG ACTTTAAGTTATAAAAAT AATGGAG GAG CTAAAGCCCTAACT GTACTAACT GTT CCG 
GaC^TAACAGGAGGATTAGGTCAAATTTATTGGTCAATGCCAGATACTTTTAAAGCACCT 
ATTACTTTTACTAACAATACTACAAAGCCAGAAACACTTCCAATTGTTGGATTACATATG 
TTTCCTTTAAAAGCAGGGTTAGTTCATAATATAAATGCGOITTTATTCTCAACTTTTGGAA 
CAAATTACAAATACAACTCAAGTATTCAATAGATTTCCTAAAAATGCTATACTAATGCyVA 
CCACCTTACAGCACCGTAACATGGATAAGTGIAAAATGTCCCCTTTGtTGCAGATCACGGG 
ATTCAGCCATTAAAAAACAGCCTTACAGGTGTACAAAGAGTTACTATAACAGACGACAGA 
AGGAGATCTTGTCCATACATACAGAAATCTTTGGCGACTGTTGTCCCTAAAGTACTTTCA 
AOTTGCTACACTTCAGTAACAATCTGGCTGATATCTCTGGGCCTTATCCTCCTG'GAACCGT 
TATGTCTATTTTAGTTAGTCCCTCTGATAATACCGGGTGGGGTATTGGAACATCAAGTAT 
GAGGGCTACTGGCTTGAAATTTtTCTAAAAAACAACCTGTTAGAGTGCGACCTTATTACAG 
AGCT CAGT GGGGACAGCTTAAT GCT C GTACTTCACTT GAGAAACTAAAAACCAAATT GAA 
ATATTATGAAAAATTGTAGAGGGACAGACTAAAAAGAAAAACAGTTGTTCCAAAGAAAAA 
GAGGT CACCTACAT CTCCTGCGGAX CGACTTAAAAAATATCTTAAAGCTGTCAGTCAAAT 
CAAAGCTTTCAATAGAGCTAGAAGAGCAGC CCAATAAATATTAT TTTT CACTT GCAGATG 
AAGGTAGTTCACGTGCTTAAATCTCCTCMCGTCGAAC^^ 

ctaaaaaaaat caatctat ctccatacattttacctaaagaattggaaggcggtttttta 

ccagctctcattcctatcatagcagccgcaattagcgcagcccct^ 

gtaatagctgctaaaaatgctaatcgttcttaaaatttagaaaactttttttttaacaga 

tcacatggctttttcaagattagctccccattgcggcttaacacctgtttatggccacac 

cgttggaatctgtgatatgagaggaggtttcagctggtctagtttgggaaattcttttac 

ttctggtttaagaaacata)ggttcatttatatcaaatactgctcaaaaaataggtcaatc 

ACAAGGATTTCAGCAAGCCAAACAAGGTCTACTGCAATCAAAT gttttagaaaat gcagg 

acaattagcaggtcaaactitaaatactttggtagatattggaagattaaaggtagagaa 

agatctagaaaaattgaaacaaaaagttatagggaacgaccw^ 

attagctcaactaatagcc^cttaaaaccaaaagat^ 

aaaaattgttgaacctatgagaccagaaattaaatctagccaaatgcctgtagaaatgtc 

TTTTTATGATTCTGTAAGT gatgaaccaat cataaaaaccaaagaagttagccctccttc 

attttcatctgaatcttcacattcatattctcacccaagaaaaa3aaaacgcgtatccgg 

ttggggtgcatttttggataacatgactggagatggagtaaattttaatacaagaagata 

TTGTTATTAAAAAC^CTTTTTATTTACAGATGGAGCCACAGCGTGAATTTTTTCACATTG 

cgggtagaaatgcaagggaatacttgtctgaaaatctggtacaattcatctctgccactc 
aaagtttttttaatcttggagaaaaatttagagatccttttgtagctccatcgacgggtg 
TAACTACTGACCGTTCTCAGAAACTTCAACTTCGTATA&TTCCGATTCAAACXGAGGACA 

AT<3AAAACTTTTACAAAACTAGATTTACTTTAAATGTAGGA<5ATAACAGAGTTGCAGATC 

TTGGAAGTGCATATTTTCACTVPTGAAGt^GTTATTGATAGAGGACCTACTTTTAi^ACCTT 

ATGOAGGGACAGCTTATAATCCATTAGCCCCAAAATCAGCTTTTCCCAATGCAGCTTTTA 

TGGATACTGATGAAGCTACAACAATTTATATTGCTCAACTCCCTAATGCTTATAATGCTC 

AAAACAAAGGTGTAGAAGAAC3CAATTCGAGTAGAAGCAAACACTACTACTCCTAATCCTC 

AATCAGGAGAATATGCTACTTAT GACTCTGCGAAATTTAATCCAGAAACTACTG GTGCTT 

C7GGAAGGCTTTTAGGAATTAATAGCTTAGGAGATCTTTTTCCGGCTTATGGATCTTATT 

GTAGACCTCAATCAGCAGATGGTAACATTTCAACTGCACCCATAACTAAAGTCTATCTAA 

ACACTACtGCTACAGATGACAGGGTCAGTGGAGTTACTGCAGTTGACACCGCAACCAGAT 

TGCATCC AGATG CT CATTATATTGAATATACTGATGAAGCCAAAGCTACAGCTATAGGAA 

ATCGCCCAAATTATATTGGTTTCCGAGACAATTTTArTGGACTCATGTTCTACAATAATG 

GTTCTAATGCAGGAACATTTTCCAGC caaacacaacaacttaatgtt GTTTTAGACTT GA 

AT GACAGAAACAGT GAACTAAGCTAT CAATAT CTAATAGCAGA? CTGACAGATAGGTATA 

GATATTTTtKTACTTTGGAACCAAGCAGraGATAGTTACG^CAGTATGTCAGAATTTTTC 

ATAATGAAGGATATGAAGAACCCCCTCCGGCCTTATCATTTCCTTCTCAAGGTATCCAA 

AATTATTTCATGCCTACTGCGGCAGGTAATGCGATGACAGTAGACACGGGTAGAAATACT 

GCAGCAAAAACAGATAACACCAAGGCTTTTATAGGATATGGCAACATGCCATCXTTGGAA 

AT GAATCTGACAGCAAATCT ACAACGTACATTTTTGT GGTCTAATGTAGCAATCTATCTG 

CCAGATAGGCTGAAAACAACACCACCCAACATAAATCTACCTGATGACACCAACTCTTAC 

GGATATATAAATGGAAGGGTCCCTCTAGCAAAGATAATAGATACATGGACTAACATTGG^ 

gctaggt^ggtcattagatgttatggatactgtaaatccatxtaatcaccacagaaattca 

GGACTAAA<3JTATAGGTCACAACTGTTAGGAAATGGAAGATATTGCAGATTTCACATTCAA 
GTACCTCAAAAATTTTTTCCTATAAAAAATCTTTTGTTGCTGCCAGGAACATATAATTAT 
GAAT GGTACTTT AGAAA 

"GGATCCCAACATGGTTTTTCAGTCTACTTTAGGTAACGACCTTAGAGCAGATGGCGCAAC 
TATTACATACACCAACATAAATTTATATGTTCGATTTTTCCCTATGAATTATGAAACAGT 
AAGTGAACTTGAATTGATGTTGGGTAATGCTACTAATGATCAAAACTTTGCAGATTATTT 
GGGTCCGGTAACTAATCTTTAT CAAATCCCAGCTAATACAAATACT GTAGTAGTGAACGT 
ACCAGATAGATCTTGGG GTGCTTTCAGAGGATGGAGTTT CAATAGAATTAAAGCTTCAGA 
AACACCTATGATAGGAGCAACAAAAGAJCCAAATTTTACTTATTCAGGAXCTATACCGCT 
ACTAGAT GGTACTTTCTATTTAACACACACT TTTCAACGAGTTTCTATTCAGT GGGATTC 
TAGCGTTCCATGGCCAGGAGATGATAGGCTTXTGATTCCAAATTGGTTTCy^AATTAAGAG 
A<^CCTAATATGGACGCAGAAGGTTATACTATGAGTCAAAGTACTATCAC^AAAaATT^ 
TTATTTGGTACAAATGCCTOCTAATTATAATCAAGC'rTATCAAGGTTATAAATTGCCAGT 
ACATTCTAAATATTATGGATTTTOAGAAAATTTTCAACCTATGAGTCGCCAAGTACCAAT 
TTATGGTAATGGCACTTATGATTTATATACTGCTTATAITACAAACCAAAGAACCATGCA 
AATTTGGAATAATAGTGGTTTAGAATCTAAAACTTCAAATCCTCCTATGTTATCCAACAC 
T GGTCATCTTTAT GTAGCTAACT GGCCATACC CTT T GATTGGAC CAAAT GCTATT GAAAA 
CCAAC^AACTGAAAGGAAATTTTTGTGTGATAAGTAa'ATGTGGCAGATACCATTTTCTAG 
TAAiTTTTTGAATAT.GGGTAATTTAACAGATTTAGGGCAAAGTGTTTTGTACACTAATTC 
TAGTCATTCACTTAATATGOTTTTTACT GTGGATAGTAT GCCTGAAACAACTTATCTAAT 
GCTTTTATTTGGTGTTTTCGACCAAGTTGTTATTAATCAACGAACAAGAAGTGGAATAAG 
TGTAGCTTATTTGCGCCTTCCTTTTTCAGCTGGTAGTGCAGCAACATGAGCGGCACATCC 

gaaagtgagctgaaaaatctgatttcatcattacatttaaataatggatttttgggcatt 

TTTGATT G CAGATTT CCAGGT TTTCTGCAAAAAT CTAAAAT T CAAACT GCTATT ATT AAT 
ACAGGTCCCAGAGfrACAAGC^GGAATACACTGGATAACAT 

T ATAAGCTATTTATATTT GAT CCACTC GGATGGAAAG ACACT CAATT AATTAAAT T TTAT 
AATTTTTCACTAAATTCTCTTATTAAAAGGTCGGCCTTAAATAACTCAGACAGATGTATT 
ACAGTAGAAAGAAATACTCAAAGTGTTCAAXGTACCTGTGCGGGATCGTGCGGCTTGTTT 
T GTATATTTTT CTTAT ACTGTTTT CACTTTTATAAACAAAATGTATTTAAAAGTTGGCTT 
TTTCAAAAATTAAACGGTTCAACCCCTTCTCTGATCCCATGTGAACCACATCTATTACAT 
GAAAACCAGACATTTCTTTATGATTTTTTAAATGCAAAAAGTGTTTATTTTCGAAAAAAT 
T AT AGAACAT TT ATT GAAAAT ACT AAGACT GGATT AATAAAAACACATTAATTGTATTCT 
TGCTTTTTGACGTTTTCATTAGTCTTCATCTTCATCTTCTTCTTCACTGCTAGATTCCAA 

gatggttttttttttctttgatggagtaggctcttcaatagttccaaaaggattcatatc 
agaatcct cttctatg ftaggcaacatagtatt tttaacctggaat gact gattccactt 
aaattgagaaaactgaattggaatgttatttcccatac^^ 

AAGAGTTAAACACTGTAACATATCT<3GCAAGCTAATTTTCATCTCACAAAA , TTTTCCATT 
ATTACGTCTCAAGTT GTAT TGATAGTTACAACATTGAAACACAAAAACAGCAGGGAAT GT 
AACTGCTGCGGCCTGAACTCTATTAACATCCTGAAC^TCAATTCCTTCCACTCCAGATAT 
AGAAAATGGAGTTATTTTAGGGAGTTGTTTTCCTATTGTTTGTTTGCCACCATAATTACA 
TTCACACTGACCCAATATAAAAAGCATATTTCCGACTT'TAGCrrTTCGGAAACACAGCTTT 
tgtagtttcaatggcattttgcatagccag(^aggccttcttttcatctgaaaagttaafi 
accacaactgcgaggagaacattgcccaaaacgctgatgggcatcctcagcacataacac 
gtaatgttccttsaactatttttactacttgtttattcatacgcccattactaagaacacc 

CCTCCCTTCCTTTAGGGCTTGCACCCCTGCTTCCGATGTTGGAGGCATTTCAATTTCATT 

cacccttttaaacatgaagt caccat gaaaagatctaggacggt cct c CTCCCAATCATG 

ATACCACAAATAACAACC AGAAGCAT TAAAGTTTGGAA2 CAAGTCAATTTGCTT ACAAAT 

tGCACTATATAGCATTCTACCTCCTACAGTAGCCATAGATTTACTGCTACTATAAGTCAA 

ATTTATAATTTTCATCTTTTTCATGTACTGAGCAAATAATTT'TPCACAATCTCCTTCTTC 

AGGAT GAAACTTCATT TGACTGGTATCAACTTTAAGACACT CTCCAAATTTAGC TAAAAT 

TTCGAGCGCCGCTTGAACTTTATTCTGAAATtCTTCTGTAGTAGATTTTCTCTTCTTGAT 

AGATTTAGTAACTTTTTTAGAAGACATTATGTTAGTTTTTTTCTCGTTGTAGGATGGCTG 

AAAAAAATAT GG GAGAGT CAGAGAAGG GT TTGAAC GAAGAAGAATTTAACT CTATT CT AT 

CAAAACATCT GGAAAGACAAAT T AAAAT CT GT AAAGCGTTAACAT CAAAATTAT C GAACT 

G GAAT ATT G GAACAT T GT T AGflAAACT T GTT ATTTT GT CC T GAT GAAAGACAAT CAT GAG 

GTGATCCCGACCCAAAACTAAACTTTTATCCGCCTTTTTTAATTCCGGAATGTCTTGCAtf 

T GCACTATCCATTTTTTCTAACAACTC CTATTCCGCTATCATGCAAAGCGAACAAAATAG 

GAACTAACACTTACCGAAAATGGATGAACAATCAAGTCCTGGATTTACAAATACCTTCCT 

TGGAAAATTGCAAATGGGATGATAGCTTGGGAAATGTAGATTTAATTGAAGAGCTTAAAG 

AGAACCAAAAACTTGTTTTAGTAAAACAAGACCATGAAAGAAATATATGGTTTAAATCAA 

AATGCAAACAACTTCAAAGTTTCAGCTATCCCTCACTCAGTCTGCCCCCAGTTTTACAAC 

AAGTTTTAATTGAATCTCTTAT CGGCATTAGTCAGGAT CCTAATAACTTTGACAAAAATT 

ACGAACCTGCAATAACTCTAGAAAAACTACAACATGTAAACTGTGATCAAGATTTAAAAC 

AAGTTCAACAAAAAGTATCTT CAGC C GCTACATACGGAATACTTTTGAAATGCATT CA*3A 

CTTTATTCAGTGACA^ATTATTCATTCAAAACTGCCAGGAATCATtACAT 

ACCATGGTTATGTAAAATTACTTCAATTTTTGACAAATGTCAGTTTAAGCGAATTTGTAA 

CTT T CCAT GGTT T AACACACAGGAAC AGACT CAAT AAT C CGC AGCAACAT ACAC AAT T GG 

CAACCGAAGACAAAATAGACTATATCATAGATACAGTGTATTTATTTTTGGTATTTACGT 

G GCAGACAGCAATGGATATTTGGAATC AAACATT AGAT GATAAAACAAT AAATATAAT TA 

AAGAGGAAT T AAAC CAAAATTTT GAGAAAATTGTC AAAG CT GAATCAGTT GATGAAGTTT 

CTGAAATTTTAAAGTCTATTATTTTCCCTGAACTCATGCTGCGAGCTTTTTGTTCTAATT 

TACCTGATTTTATAPATCAGAGTCAGATATCAW^TTTTAGAAACTTTATCTGCATTAAAT 

CCGGCATACCGCAGTCAATTTGCCCCCTATTACCTTCAGAXCTAATTCCTTTAACTTTCC 

TAGAAAGTCATCCAATACTCTGGAGTCATGTAATGTTACTAAATCTTGCTTCMTTCTAG 

TAAACCAAGGCAATTATTTGCATGAACCCGAAAAACCTTTAAATATTTCATCAGTTTACT 

GTAATTGTAATTTATGCTCTCCGCAAAGAATGCCATGrTACAATAGCAGTTTGATGCAAG 

AAAT ACT AACC ATTGATAAAT TC GAGTT GAC AAACT CT GATAAAAC AAAACAG CTAAAAC 

T GACC CT C CAAACTT TT GCTAAT GCCTAT CT TAACAAATTTAACT CAGC AGAAT T CTACC 
ATGACCAAGTTTTATTCTACAAAAACTGTAAAAGTAAATTTTCTAACCAATTA^CAGCTT 

GT GT AAT AAAAG ACGAAAAAT TATTGGCT AAAATAGC AGAAAT T CAAATAAC GC GGG AAA 
AAGAACTCTTAAAAAGAGGAAAAGGAATTTATTTGGATCCAGAAACAGGAGAAATCTTAA 
AC AAT GGAGAAGCCAT AT CAT CCT CT GAAAACTTCCAAAGGCAAAGAACTAGCT ATGCTC 
T AC CAT C AftAT GAAGGAGAGC GAGCTGGATG G GAAGCCGATGAGCGAAGAAGAC GAAGGA 
GAAGTGAGTGAGGATGAAACAGAGACAACAATTCCAAAGAAAM 

TAAGCTCTAAATTTTTTATATTAAAAACTGAATTTTTTTAGACAAAMTATTTTAAATTA 

AAT C TT TATAGCTAGCAGTT GAT CTT T GTT CGTTTTT CAGAAAACTCAAGTGTT CAGTC 
ATATCAAGTTCACTTGCCTCTGAAACACGAAATTGCGGAAATTCTAGAAAAAATTAGACT 

AGAATC TAAAAAATAT CCAGGAAAAGTTTAT CAAAT AAGAAATAGAACT CCAGCAAGTAT 
T ACAAAAC GATACCT GT ATGAAAGAGAT CT GAAGAAACT GTT CCAGTATCTAG AAGACGC 
AAAGAAGCTTTACGCTAAGTAC CAAAGCT GAGCCTTTATAGTTTTAAATTTT C C C G C CAT 
GGCTCAACCAGTGAC GCCTTACGTCT GGAAATACCAACCAGAAACAGGATATACTGCTGs» 
AGCCCATCAAAATTATAACACTGTTATCAACTGGTTGCATGCCAATCCACAAATGTTTGC 

CAGAATTCAACATATAAACACCGCACGCAAOT 

CCGAGATGACATCGCGGTTAACATCAACAACTGGCCTGCAGAGGATTTAATGCAACCTCC 
TAATTTTCCTTACATTCCTGCC^CCTCTAAATCCGCTTCAACCATAAATGACTGGTTGGC 

T AC CAC T CAAGGAATTCAACTC AGTG GAACT AGT GAACTAAACG GGT GGGGAT CT AAC CG 
CCTGACTTCCTATCCGGATAtTCCACCCATTTTAAAGTATGAAAGGCCTGGT^^CT 

T CAAGG C CAAGGACTTTTT AAG CAAGAAAAT ATT CATTTATT TT AC GAATC T CCGCGC CT 
CCCTCGCTCTGGAGGATTAACTCCCCAACAATTTGTAAAAGAATTTCCGCCTGTTGTTTA 
T AATAACCCCTT CTCAGAAT CTATGAGTGTATTTCCGAAAGAATT'T AGTCCTTT GTTTAA 
CCCTT CAGAATCTTTGAAAAAAACAT GCAGTCAAACTTTACAATATAAATAAAAAACTTC 
TATT GAT CTTTATACT TACACTAAAGC ATC GCGT T TATTTT CGT C ^^^^^ ^ 

CAAAGAC C C GT AATTCT CT AACT TTAAAT CATTTTTT GAACTAAT ^^^^I^^^^h* 
TGTAGGAATTAATATATCAGAAACCAGTAACAAGCCAGAATTAAAATATACTTGTGTCAT 



TTTTACAGAT GAAGC GAGCACGCTGGGAC CC S CTTTATC C CTTTTCT GAAGAGAi^ACT GG 

TTCCTCTGCCTCCTTTTATTGAAGCC<3GAAAAGGGCTAAAAAGCGAAGGGTTGATCTTAT 

CTTTAAACTTTACTGATCCTATCACTATAAATCAAACCGGTTTCTTAACTGTAAAATTGG 

GAGATGGAATATTCATAAACGGAGAGGGT GGCCTATCAAG CACTGCTCCAAAAGTCAAAG 

TTCCCCTGACTGTCTCAGATGAAACATTGCAACTGCTATTAAGTAATTCTCTAACAACTG 

AGTCAGACT CT TTAGCTTTAAAACAACCGCAACTTCC CCTAAAAATAAATGAT^GGGGA 

GTTTAGTAT T GAACT TAAAT ACT CC TTTAAAT CT ACAAAAT GAGAGAT T GAGTT TAAAT G 

TTTCAAATCCACTAAAGATAGCGGCAGATTCTTTAACTATAAACTTAAAGGAACCCCTAG 

GATTGCAAAATGAAAGTTTGGGCTTAAATCTAAGT GATCCTATGAATATAACTC C AGAAG 

GAAATTTAGGTATTAAATTGAAAAATCCTATGA^GTTGAAGAAAGTTCTTTAGCCTTAA 

ACTATAAGAATCCTCXCGCCATTAGTAATGATGCGTTAAGTATAAACATTGCGAATCCAT 

TAACTGTTAATACAAGCGGATCTCTAGGAATATCTTATTCTACTCCCTTACGAATTTCAA 

ATAATGCTTTATCATTATTTATAGGAAAACCTTTAG<^TTAGGAACTGACGGC'rCTTTAA 

CtGTAAATTTAACTAGGCCTCTGGTATGTCGTCAGAACACTTTGGCCATAAACTACTCAG 

C CCCACTAGTGTCAT TGCAAGACAAT CTTACTTTAAGTTATGCTCAACCATTAACTGTAA 

G C GAT AATT CTTT AAGAT T GT C T CT AAAT T CT C CACTAAACAC AAATAGT GAT GGAAAAC 

TTAGTGTAAACTATTCTAATCCTTTAGTTGTGACTGACTCTAATCTTACCCTCAGTGTTA 

AAAAAC CT GT AAT GATTAACAACACAGGT AAT GTT GACTTAAGCT T TACAGCT C C CATAA 

AAT TAAAT GAT GCAGAACAGT T GACT TTAGAAAC CACT GAGC CCTTG GAAGT GGC C GAT A 

ACGCTCTAAAACTGAAACTTGGAAAAGGCTTAACTGTTAGTAATAATGCTTTAACCTTAA 

ACCTTGGAAACGGTTTGACTTTCCAACAAjGGTCTTTTACAAATTAAAACTAATAGCTCTC 

TAGGGTTTAATGCTTCTGGGGAATTATCAACAGCTACAAACCAGGGAACCATAACCCTTA 

ACTTTCTAAGCACAACTCCTATAOCTTTTGGGTGGGAAATAATACCTACTACTGTAGCTT 

TCATTTATATTTTATCAGGAACACAATTTACTCCTCAATCCCCAGTAACTTCTTTAGCTT 

TTCAACCCCCACAAGACTTTTTCOATTTCTTCGTTTTAAGTCCGTTTGTTACATCTGTAA 

CTCAAATTGTGGGAAATGATGTTAAGGTTATTGGCCTAACTATTTCTAAAAACCAATCTA 

CCATAACTATGAAATTTACTTCTCCCTTAGCTGAAAATGTAGCAGTTAGTATGTTTACAG 

CACAT CAATT GAGACAAT GAATATTTTAAAAAT TCTT TATTAAAGAGTAAT CTTTTTACA 

7 AC C GTTCH'TGACATAATGT GC CT CTATAPiTTAACA^TCTA?\GCAAGCAAGGTTGAT CA 

TTG GAAT CTATAGAAGCATAACT CTTCCAATAAG CATAATCATAT GG C GGTAAATGAAAA 

C CC C TT AAAT CT AC CAT ATT CAT CTTTAAGT GT ACAGTAT CTAACAG GTTTTT ACAAT CT 

TGCACTTCT GGACTTTTAAAAACAAACAGTACTTTCMACSGACAACAATTGTAACGGTTA 

TAATCTGTTACAATTTTACTTATTTCTTCTTCCAAT GGCAAAGCATT CCAAAGT CTTGTT 

AT AAGT ACT GTAAAAT CAT CAAAT GAAT AACAT AACAC AT T T GTACAACAAT TGGTC CAA 

GGTAAAAAAACAGGCACACGAACATGAACTTTTTTTAAAftTTAACATCAGTGTCTGTTTT 

AAACTTTSACATTGCAAAGAATTTGGCTGCAAGCAATGACAATGAAATTGATTTTGCTGA 

CAAGGTAAG T CACACAAAT ACAACT T T AACAGC CT AAAT ATAAC AACAT T AAT GTAACT T 

TCCAAGACTTTAAAACTAACAAACGGTATATCACAATAAAAAAGATtSATGAATCCCTTCG 

CAACACAT AAT G GAGTT CAT GCTACATCCAAAGATGGTT C C GACAAAC CTCTGTAAATTA 

AAGAATAACAAT ACAACATACGAAGAAAATT AAAACGTTTTT CAAAAC GAGATAT ACATT 

GCTGCAAAGTAT CTGAACATT TACATTTTATACTTAT AAGCTCACAAGTT T CAGAAAAT G 

TAATTCGTTTAACAGTTTGATATGAATACCATTTTGAAGAAAAAT 

CAXCTTCCATCACTCCAGAAAATAAAAAAT 

agaaaQagttttgtg 

catttgtG^gctcccagaaacattaacggac^gcaaa 
aact^tcttaacgtttcgttcagaaaacaaagtaac^^ 

TAAAACACTTTTGGCAGCTAAACATTGCAAAlGATCC^ 

ataaaacttataagccatat cggccctcttgcaaaacgaatcagctttttggcttatag g 

aaaataacaaaaaaactgattatatatgaatggagttaatatct 

acgaatagcagaaccaagacgaccacgcccaacac^ggtaaatatttcaa 

aggaacagatggtttctcacaagcaacaactttgaittgcttatc 

ggcttafttaggaaaagaagaaaaataattttcccaata^ 

TCATCCTGTACATTACTAGTCACAAATAC^CCT^ 

aaaactcccaccaaattgtcccagtctacctgaaaaaagccagttccc^tattttcaaaa 
t tt gc ccattt taaataat c caaagcatcaaattcaggaaacaaat ctttct gagctaaa 
aca?atacaottttatcgccattaaatctaaaagccatcctaa^tggmctctagcccag 

TAGTTTAAGTACCGGGAAGAGACTATACAATATACTTGATATTGATGTCTGTTAAGTGGT 

GATAAAAAAGAAAGTAATTCAC5AATTAGGATAAAGCATTCTCCCATGTTGATTCATCTAC 

AAAAAACAAAAAAATTATAAGGTTCATAGAAAACCTACTATTTAACAAATCTA 

G C ATTAAAAAGTT AC CT T GAAT ATAAATT CAGAT CACCTAAAAAAC GAAAAAAAAT AACA 

TTTATCTTAGTAAATGATAGTCTTTAAAAATTAGAAAAGAATCAAGTCGCTTTtPATACTT 

ACAAACTCCAAATAAAT T CT GT AAC CAAGAGAAAAATT GTAAC CTAAAAGGTAAAGAAGA 



a 



ACATTATAAGATTAAAAC CACTCTAAAATCT GAAAAG CATT AT GAAAAATT CT GAT AGCT 
GCAACTTACTACTCTTCTCCAAATGTTGCAGGCATTTCAAAAAATCAAGAGGAAAACCGG 
AOTTFATAAAGTAGTAGT CTGATTATATCT GAAAAAGTTTAACTT CCTTTTCAACCCAAC 
CCAGTCCAATAAAATTCCAACCTTAACTTCTTTCCTGCTAAAACTCCATAAMGTCCAAT 
TAC CACTTGACTTTTATTTAACCTCAATTATGTTAGAT GTTATTCTACCCATAAAAACTT 
GATGACCAAGAACTC^CCTTTCCCATGTTTTTCTGAAATAACAAAAATGTTGATTTAAAS 
ATTTTTAACTACCCAAAAAACCCGCTCTCATGATTTTTTCTTATATAAACAGGATACAAA 
AGAACTGGCAAAGATATTCCATC^TACTTCTCCAACTGT 

TCCCATGTTTITTCCCTTTTGCACAAACAGGATATAAAAAATA^TTTTGCCACAATGTTT 

TTCCTTTTACTCAACTGCCAGAATAAAAATGAACAGCTTAACCTTTTTCCCTCTTAACCC 

ATTGCGTTCCTCTAAGAAAAAAATTATCCCGCCCAATATGCTAAAGGCTTCTCCCGCCAA 

AACAGCTCAACTT AAAAT CT CT CAt GAATAAAACC CAGAGAAAATTTCCAGT AA'PAAAAA 

TTAAT AACCGT GAAGT ACT AGAT C? AAT AAT GATATTTT G AACT C ATAAAAAT C CAC CAT 

CCATGTAAT GTTACAAACACTTTTTTATT GAGTTTTTTCTTACAACTGCATTACATACAG 

GCCAAGCATCAAACTTTCTTCTGTATTTCTTCCTAGACCACAAAATTACAGACTTATATT 

T CTGCCACAAATCT CTATGATCTTTACAGTAACACTTACATTTAAATGGGGAATACAGCA 

G CAAAT AAGGATGAGTT AAACAT GC GAT ACAATGAC CAGAAGGAAGATAATACAATACAT 

CACACCAAAAT GAAG GTACAGACAACATCGCATGAAAT CTTAAAT GTGATTTTAC AAT AA 

ATTTCTGCAGCAGCTTACAATCTATAOTAGC1AAACCGTTTTATATACAAACATAAAAACT 

TGGAACTTTTCACCAACTC^TCATGTTATTATAACACATTACAAATTTTGCTATATCTT 

TATTTGTCAAATAACAAAATATCTCAATC CACAGCTCATCTGQCIAjGCAAACTTC GGAAAT 

CCATGACCTGTAAAAGATACAACAGAAAACAGAAAATTAATGCCATTCAATAACATAAAA 

AATACAGTCAAATCACATACTTTTTCTCACT^^ 

CAAACTTC^GAAAATGGATGCATACAAGAACATTCTCCTCTCAAAAA 

ATGCGGCATTTTGCACCTCCAGAAAAATGCAGTCCATTGAGAGGCTCTTCTCTTAAAACA

CAGAAATGCTTCTGCAAAATCTGTAAAGAAACTAACAACTTCCAAATTCCAATCATCATG 

CATTGCAAAGAAGGACATTCAACAGCAAAAGGATCGTGATGAGCCAAT^ 

TATWCTC^TTTTCATGAATTACAGTCTGTAACTTACTATAArGCATTTTAAGCrrCTGCT 

TCACAAATTAATAATGCTAATTTCTTTAAGCAGCTCAAAGAAAACTCATCAGGACAACGG

CATTTAAGAAAGCAACAAAATGATTTCTTAAAATACATTTTTCCAGCATGATGAACAATA 

AAAAATTT CAACGTTAAACAATGCAAAAAT GCATTTTTA2 ^ T 

TCAGCTGAAGCTAAATCACAGCCTATTTTATTACATGATTTTGTATGCTCCAAAAGAGCT

TGTTTTAATTGCTTCAAATCCATCTTCTTACAATTTTTTCTTTTTATAAACACCAGAACC 

GCATTCAGGCCAATTCCAGTTATTGTTTAAATTTGCTACAGAAACTGCAGACCACAAAAC

CACATCCTCTAAATCAAGCCACAAAGATCTATGATCCACACAAAAACACAAAGAATGATA

cggagaatacaacaataaatggg gatt aacaagggac gcaacacaai gacccgaaggt aa 
taaagttttacagcaccaattacaagcaacaggtaatggagtatatttcccaatgcgacg 
agaaagccgaatgtcattcagaacagcattgcattttatcttctcaaacctcttaagctg 

CAATT GT AT AAAATAAGAATCCT TAAT GACAGT GAT GAATT GAGGAAAAGCAAP AACAAA 
ACTAGCAAT GTCTTTGCTTGTAAGTTTCAAAAATATCTTCAt CCAAAT CTCAGTCGGTAA 

ttcaao\aaaaattcaggCgcctaca?wot 

acagcgaaaagaaaaaataacacaccgaaaaataaaaaactcttacccctgttatccatc
GAGATACACAGAAAAATTCAGAACACTCAGTGT catgtttcttaaatt gttcccaaagct 

cagacattctaagccaaaaattttttgagaactgcaaaaacccagttottataacaaagc 
cTTAATGTTTTCTTAAcrwrrTAACTGCcc^^ 

cacccaggggacaaatcttgccaagaactacaagtccataaaacaagatcctccaaatta
taccaaaggtttctatggtcwcacaattaca^ctgacctaaaaggtgaataaagcagt 
aaataaggatgagttaaacaggccacacaatgtccagaatgtaaaaaatgctttgtttgg 
caccaacavgaccacagctgaagcaaaggaaaat^ 

CTGTTTAACACAGAACAACATTCAATTCTGGCAAACCTCTTTAAAAAATGTTTTCTGAAA
TAXTTCTTTAAAAT GACAGT T TGCAACTCT GGAAAACACAAAATAAAAGCCGCAAT AT CT 
CTACTGCTTAAATATAAAAATATCATTGTCCAAATTTCTACTGGTAAAACTGAAAGCAXC 
TTCTTCCTATTAAAAAAAGAAAAGTGTTTTCAAATTATATTAGACTCTAACCAAAAAAAT 
TCAAATACTTTTCCTTTATAATGTACATTAAGAATAAAAATATACTCACCGTTTAAAAGT 
AGAACTTAACAGTATAATATAAATACAAGTGAGCTGAACAACGACAGCCGATTTCAGCCG 
GAGCAAAATTAAAAAGAATAAAAGGATCAAACCAACACGTAGGACAGTCTACTCCAAAAC 
AGTAACGGCAGTATGACACAGAAGGAGAGGAACTM 

ATAAAAAGTAACGCCGCCGGAAAGCAGTTGAATACAAAAGAGGTAAAAATTCACGAAAAA
CAGAAGCAAAAACTACTAAATCTGCTATTGGCAAATAAAGAAAAATTTCAAACCATATTT
CCAAAGGAAGAAAAGCAATCATACCGTAGAAGAACCTGAAGGCGACCGCAAACGTGCTCC
CGTACCACAACGTCACACGCCACACCCACTGGGAAAACCCACACGCCCCGCCTCTGTGCA 
ACOTTATATATATGAATAG 
end OAV287/start Bluescribe sequence



GTACC CTTTGTTCCCTTTAGTGAGGGTTAA 

TTCCGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCA 
CAATTCCACACAACATAC<jAGCCGGAAGC^TAAAGTGTAAAGCCTGGGGTGCCTAATGA<5 
TGAGCTAACTGACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGT 

ggtgccagctgcattaatgaatcgsccaacgcgcggggagaggcggtttgcgtattgggc 

GCTCrTCCGCTtCCTCGCTCACTGAC^CGCTGCGCTCGGTCGtTCGGCTGCGGCC^GCGG 
TAT CAGCT CACTCAAAG GCG GTAATAC GGT T A? CCACAGAATCAGGGGATAAC GCAG GAA 

agaacatgtsagc^aaaggccagcaaaaggccagg^^ 

CGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGA 

GGTGGC(3AAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCG 

TGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGG 

GAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGCTCGTTC 

GCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCG 

GTAACTATCGTCTTSAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAG'ZAGCCA 

CTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTAGAGAGTTCTTGAAGTGGT

GGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTATCTGCGCTCTGCTCAAGCCAG 

TTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGGAAACAAACCACCGC'TGGTAGCG 

GTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATC

CTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTT

TGGTCATGAGAT TATCAAAAAGGATCOT CACCTAGATCCTTTTAAA.TTAAAAATGAAGTT 

TTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCA

GTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCG

TCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAA'TCATAC 

CGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGG 

CCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCC 

GGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTA 

CAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAAC 

GATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTC 

CTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCAC

TGCATAAXTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACT 

CAACCAAGTCATTCTQAGAATAGTGTAT GCGGCGACCGAGTT GCT CTTGCCCGGCGTCAA 

TACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTT 

CTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCA 

CTCGTGCACCCAACTGATCTTCAGCATCrTTTACTTTCACCAGCGTTTCTGGGTGAGCAA 

AAACAGGAAGGCAAAATGCCGCAAAAAAGGfSAATAAGG^GACACGGAAATflTTGAATAC 

TCATACTCTTCCTTTTT CAATATTATTGAAGCAT^ 

GATACATATTTC^TGTATTTAGAAAAATAAACT^^ 

GAAAAGTGCCACCTGACGTCTAAGAAACCATTATTATCATGACATTAACCTATAAAAATA
GGCGTATCACGAGGCCCTTTCGTCTCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGAC
ACATGCAGCTCCCGGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAG
CCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCTGGCTTAACTATGCGGCAT
CAGAGCAGATTGTACTGAGAGTGCACCATATGCGGTGTGAAATACCGCACAGATGCGTAA
GGA<aAAAATACCGCATCAGGAAATTGTAAACGTXAATATTTTGT'TAAAATTCG C GTTAAA 
TTTTTGTTAAATCAGCTCATTTTTTAACCAATAGGCCGAAATCGGCAAAATCCCTTATAA 

atcaaaagaatagaccgagatagggttgagtgttgttcc^gtttggaacaagagtccact 
attaaagaacgtogactccaacgtcaaagggcgaaaaaccgtctatcagggcgatggccc 
actacgtgaaccatcaccctaatca^gttttttggggtcgaggtgccgtaaagcactaaa 
tcggaaccctaaagggagcccccgatttagagcttgacggggaaagccgccgaacgtggc
gagaaaggaagggaagaaagcgaaaggagcgggcgctagggcgctggcaagtgtagcggt 
cacgctgcgcgtaaccaccacacccgccgcgcttaatgcgccgctacagggcgcgtcgcg 
ccattcgccattcaggctgcgcaactgttgggaagggcgatcggtgcgggcctcttcgct 
attacgccagctggcgaaagggggatgtgctgcaaggcgaxtaaottgggtaaxgccagg 
gttttcccagtcacgacgttgtaaaacgacggccagtgaattgt/atacgactcacrata 

GGGCGAATTCGAGCTC GGTAC 1 end of Bluescribe sequences 
KpnI site with 5' base



